(57) Abstract

The invention relates to novel human DNA sequences, targeting constructs, and methods for producing novel genes encoding
thrombopoietin, DNase I, and β-interferon by homologous recombination. The targeting constructs comprise at least (a) a targeting
sequence; (b) a regulatory sequence; (c) an exon; and (d) a splice-donor site. The targeting constructs, which can undergo homologous
recombination with endogenous cellular sequences to generate a novel gene, are introduced into cells to produce homologously recombinant
cells. The homologously recombinant cells are then maintained under conditions which will permit transcription of the novel gene and
translation of the mRNA produced, resulting in production of either thrombopoietin, DNase I, or β-interferon. The invention further relates
to methods of producing pharmaceutically useful preparations containing thrombopoietin, DNase I, or β-interferon from homologously
recombinant cells and methods of gene therapy comprising administering homologously recombinant cells producing thrombopoietin,
DNase I, or β-interferon to a patient for therapeutic purposes.
PROTEIN PRODUCTION AND DELIVERY

Background of the Invention 

Current approaches to treating disease by administering
therapeutic proteins include in vitro production of
therapeutic proteins for conventional pharmaceutical delivery
(e.g. intravenous, subcutaneous, or intramuscular
injection, or by intranasal or intratracheal aerosol
administration) and, more recently, gene therapy.

One protein which may be useful in the treatment of
platelet disorders is thrombopoietin (TPO). Platelets are
small (2-3 microns in diameter) anucleated cells which play
an important role in primary hemostasis by adhering to and
aggregating at sites of vascular damage. In addition,
platelets release factors which are important components of
the blood coagulation, inflammation, and wound healing
pathways. Patients with very low levels of circulating
platelets (thrombocytopenia) exhibit bleeding into superficial
sites (e.g. skin, mucous membranes, genitourinary
tract, and gastrointestinal tract) as a result of mild
trauma, and are at risk for death from catastrophic hemorrhage
occurring spontaneously or resulting from trauma.
The physiologic role of platelets and the etiology of
platelet disorders have been described (cf. Hematology:
Clinical and Laboratory Practice, Eds. R.L. Bick et al.,
pp. 1337-1389, Mosby, St. Louis (1993); Harrison's Principles
of Internal Medicine, Eds. J.D. Wilson et al., 11th
Ed., pp. 1500-1505, McGraw Hill, New York, 1991).

Thrombocytopenia may be caused by decreased production
of platelets by the bone marrow, increased sequestration of
platelets in the spleen, or accelerated platelet destruction.
Decreased production of platelets by the bone marrow
may result from destruction of hematopoietic precursor
cells by irradiation or treatment with cytotoxic agents
during therapy for cancer. In addition, alcohol,
estrogens, and thiazide diuretics can suppress platelet
production (drug-induced thrombocytopenia). Furthermore,
infiltration of the bone marrow by malignant cells and the
disorders congenital amegakaryocytic hypoplasia and
thrombocytopenia with absent radii (TAR syndrome) can result in
decreased platelet production.

Increased splenic sequestration of platelets may occur
as a result of splenomegaly associated with a variety of
conditions, including liver disease, infiltration of the
spleen with tumor cells as in myeloproliferative or
lymphoproliferative disorders, and Gaucher's disease.

Accelerated platelet destruction and thrombocytopenia
may be caused by vasculitis, hemolytic uremic syndrome,
disseminated intravascular coagulation, and the presence of
intravascular prosthetic devices such as cardiac valves.
In addition, certain viral infections, drugs, and autoimmune
disorders lead to immunologic thrombocytopenia in
which platelets become coated with antibody, immune complexes,
or complement and are rapidly cleared from the
circulation. A number of drugs can elicit an immune response
leading to immunologic thrombocytopenia, including
sulfathiazole, novobiocin, para-aminosalicylate, quinidine,
quinine, carbamazepine, digitoxin, arsenical drugs, and
methyldopa.

Thrombocytopenia is currently treated most readily by
transfusion with platelet concentrates, although corticosteroid
therapy or plasmapheresis can be effective in
immunologic thrombocytopenia. Treatment with platelet
concentrates is severely limited by availability of suitable
donors and the risk of transmission of blood-borne
infectious diseases.

As an alternative to transfusion therapy, platelet
deficiencies could be treated with hematopoietic growth
factors which promote proliferation and maturation of
megakaryocytes, the nucleated progenitor cells from which
platelets are derived. Recently, cDNA clones were isolated
which encode the human, mouse, and dog analogs of a protein
purified from aplastic porcine plasma which displays
megakaryocytopoietic activity (de Sauvage, F.J. et al.
Nature 369:533-538 (1994); Lok, S. et al. Nature 369:565-568
(1994); Bartley, T.D. et al. Cell 77:1117-1124 (1994)).
The encoded protein, termed thrombopoietin (TPO), stimulates
proliferation and maturation of megakaryocytes and
induces platelet production in vivo upon injection into
experimental animals.

Methods for the production and delivery of other
proteins with therapeutic properties are desirable. For
example, it has been demonstrated that recombinant
β-interferon is an effective medication for treatment of
exacerbations in patients with relapsing-remitting multiple
sclerosis (MS; see Kelley, C.L. and Smeltzer, S.C. J.
Neuroscience Nursing 25:52-56 (1994)). Furthermore, it has
been reported that β-interferon isolated from non-transfected
cultured human fibroblasts may be an effective
means for preventing the progression of acute non-A, non-B
hepatitis to chronic disease (Omata, M. et al., Lancet
338:914-915 (1991)).

As another example, it has been demonstrated that
recombinant human DNase I is an effective agent for
reducing the viscosity of sputum from cystic fibrosis (CF)
patients (Shak, S. et al., Proc. Natl. Acad. Sci. USA
87:9188-9192 (1990)) and for improving pulmonary function
and decreasing exacerbations of respiratory disease in CF
patients (Fuchs, H.J. et al., New Engl. J. Med. 331:637-642
(1994)). It has been further suggested that DNase I may be
effective in improving respiratory function in patients
with other respiratory diseases, such as chronic bronchitis
and pneumonia (Shak, S. et al., op. cit.).

While TPO, β-interferon, and DNase I are useful, for
example, in the treatment of thrombocytopenia, MS, and CF,
respectively, production of therapeutic proteins using
genetic engineering technology as taught in the prior art
is limited to conventional recombinant DNA methods, in
which the recombinant protein is purified from mammalian
cells expressing an exogenous cloned gene or cDNA under the
control of a suitable promoter. The exogenous DNA encoding
the protein of interest is introduced into cells in the
form of a viral vector, circular plasmid DNA, or linear DNA
fragment. Chinese Hamster Ovary (CHO) cell lines and their
derivatives (Gottesman, M.M. Meth. Enzymol. 152:3-8 (1987))
or mouse cell lines, such as NSO (Galfre, G. and Milstein,
C., Meth. Enzymol. 73(B):3-46 (1981)) or P3X63Ag8.653
(Kearney, J. et al. J. Immunol. 123:1548-1550 (1979)) are
commonly used, and the production of human therapeutic
proteins is thus accomplished by expression and purification
of the protein from a cell of non-human origin.

In many cases, it is desirable to produce human
therapeutic proteins in a human cell, for example, when it
is desired that the glycosylation pattern of the protein be
similar to patterns normally found on human cells. In
addition, the expression of human proteins in human cells
is important in the development of gene therapy methods, in
which a patient's cells are engineered to produce a desired
therapeutic protein to alleviate the symptoms or cure a
disease.

Clearly, the development of novel methods for the
production of these human proteins in human cells would be
of benefit to patients, through the availability of a wider
range of products with therapeutic effectiveness. One
approach proposed by scientists in the field for
accomplishing this goal is to use homologous recombination,
or gene targeting, to introduce a cloned, exogenous
regulatory element (i.e. a promoter and/or enhancer) into a
cell's genome at a pre-selected site such that the
regulatory element activates expression of a nearby gene,
ultimately resulting in production of the protein encoded
by that gene. This approach has been suggested in U.S.
Patent No. 5,272,071 and in foreign patent applications
WO 91/06666, WO 91/06667 and WO 90/11354.

Summary of the Invention

Described herein are new methods for producing TPO,
DNase I, and β-interferon through the generation of novel
transcription units within a cell's genome, methods which
differ dramatically from those in the art and represent a
major advance in the ability to manipulate expression in
mammalian cells. The methods are based on the fact that an
exogenous regulatory sequence, an exogenous exon, either
coding or non-coding, and a splice-donor site can be
introduced into a preselected site in the genome by
homologous recombination. The resulting cells are referred
to as targeted or homologously recombinant cells. The
introduced DNA is positioned such that transcripts under
the control of the exogenous regulatory region include both
the exogenous exon and endogenous exons present in either
the TPO, DNase I, or β-interferon genes, resulting in
transcripts in which the exogenous and endogenous exons are
operatively linked. The novel transcription units produced
by homologous recombination allow TPO, DNase I, or
β-interferon to be produced in human cells using the
naturally-occurring endogenous exons encoding these proteins without
introducing any portion of the coding sequences of the
cognate genes. The present invention further relates to
improved materials and methods for both the in vitro
production of TPO, β-interferon, and DNase I and for the
production and delivery of TPO, β-interferon, and DNase I
by gene therapy.

The methods of the present invention teach the
production of TPO, β-interferon, or DNase I by gene
activation, in which the coding DNA sequence of the
corresponding protein is not introduced into a cell by
transfection of exogenous DNA encoding the protein.
Instead, noncoding sequences upstream of one of these genes
or coding or noncoding sequences within the genes are
manipulated by gene targeting to create a novel
transcription unit which expresses TPO, β-interferon, or
DNase I. It is a purpose of this invention to define
sequences upstream of the TPO, β-interferon, or DNase I
genes, non-coding sequences (introns and 5' non-translated
sequences) within the human TPO, β-interferon, or DNase I
genes, and methods for utilizing these sequences for the
production of TPO, β-interferon, or DNase I.

The methods described herein teach production of TPO,
β-interferon, or DNase I proteins by the generation of
novel genes in which exogenous and endogenous exons are
operatively linked. As a result of introduction of
exogenous components into the chromosomal DNA of a cell,
the expression of the protein encoded by the endogenous
gene is activated. Other forms of altered gene expression
may be envisioned, such as increasing expression of a gene
which is expressed in the cell as obtained, changing the
pattern of regulation or induction such that it is
different from that which occurs in the cell as obtained, and reducing
(including eliminating) expression of a gene which is
expressed in the cell as obtained. For example, it may be
desirable to perform in vitro protein production or gene
therapy to produce a protein other than TPO, DNase I, or
β-interferon using a cell type that naturally produces one
of these proteins. In these settings, it would be desirable
to eliminate expression of TPO, DNase I, or
β-interferon.

The present invention further relates to DNA
constructs useful in the method of activation of the TPO,
β-interferon, or DNase I genes. The DNA constructs
comprise: (a) targeting sequences; (b) a regulatory
sequence; (c) an exon; and (d) an unpaired splice-donor
site. The targeting sequence in the DNA construct is
derived from chromosomal DNA lying within and/or upstream
of the desired gene and directs the integration of elements
(a) - (d) into the chromosomal DNA in a cell such that the
elements (b) - (d) are operatively linked to sequences of
the desired endogenous gene. In another embodiment, the
DNA constructs comprise: (a) a targeting sequence; (b) a
regulatory sequence; (c) an exon; (d) a splice-donor site;
(e) an intron; and (f) a splice-acceptor site, wherein the
targeting sequence in the DNA construct is derived from
chromosomal DNA lying within and/or upstream of the desired
gene and directs the integration of elements (a) - (f) such
that the elements of (b) - (f) are operatively linked to
the desired endogenous gene. The targeting sequence is
homologous to the preselected site within or upstream of
the TPO, β-interferon, or DNase I genes in the cellular
chromosomal DNA with which homologous recombination is to
occur. In the construct, the exon is generally 3' of the
regulatory sequence and the splice-donor site is 3' of the
exon. Constructs of this type are disclosed in pending
U.S. patent applications U.S. S.N. 07/985,586 and U.S. S.N.
08/243,391, all of which are incorporated herein by
reference.
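
The element ordering described above (exon 3' of the regulatory sequence, splice-donor site 3' of the exon) can be sketched as a simple check. This is only an illustrative model, not part of the disclosed method: the names `Construct`, `REQUIRED_ORDER`, and `is_well_ordered` are hypothetical, and placing a second targeting sequence at the 3' end of the example is an assumption made for illustration.

```python
from dataclasses import dataclass

# Elements (b)-(d) in the 5'-to-3' order stated in the text.
REQUIRED_ORDER = ["regulatory sequence", "exon", "splice-donor site"]


@dataclass
class Construct:
    """Hypothetical model: construct elements listed from 5' to 3'."""
    elements: list


def is_well_ordered(construct):
    """Return True if a targeting sequence is present and the regulatory
    sequence, exon, and splice-donor site occur in that 5'-to-3' order."""
    names = construct.elements
    if "targeting sequence" not in names:
        return False
    try:
        positions = [names.index(name) for name in REQUIRED_ORDER]
    except ValueError:
        # At least one required element is missing.
        return False
    return positions == sorted(positions)


# Illustrative layout only (flanking arrangement is an assumption).
example = Construct([
    "targeting sequence",
    "regulatory sequence",
    "exon",
    "splice-donor site",
    "targeting sequence",
])
print(is_well_ordered(example))
```

A construct that lists the splice-donor site 5' of the exon, or omits the regulatory sequence, fails the check.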

The following serves to illustrate two embodiments of
the present invention, in which the sequences upstream of
the TPO gene are altered to allow expression of TPO in
primary, secondary, or immortalized cells which do not
express TPO in detectable quantities in their untransfected
state as obtained. In embodiment 1 (Figure 1), the
targeting construct contains two targeting sequences. Both
the first and second targeting sequences are homologous to
sequences upstream of the TPO coding region, with the first
targeting sequence 5' of the second targeting sequence.
The targeting construct also contains a regulatory region,
an exon (which in this case comprises noncoding sequences
and begins at a CAP site) and an unpaired splice-donor
site. The homologous recombination event that generates
the novel transcription unit producing TPO is shown in
Figure 1.

In embodiment 2 (Figure 2), the targeting construct
also contains two targeting sequences. The first targeting
sequence is homologous to sequences upstream of the
endogenous TPO coding region, and the second targeting
sequence is homologous to the second intron of the TPO
gene. The targeting construct also contains a regulatory
region, an exon (in this case a coding exon derived from
the human growth hormone (hGH) gene) and an unpaired
splice-donor site. The homologous recombination event that
generates the novel transcription unit producing TPO is
shown in Figure 2.

In these two embodiments, the products of the
targeting events are novel transcription units which
generate a mature mRNA in which an exogenous exon is
positioned upstream of exon 2 (Embodiment 1) or exon 3
(Embodiment 2) of the endogenous TPO gene. The product of
transcription, splicing, translation, and post-translational
cleavage of the signal peptide is mature TPO.
Embodiments 1 and 2 differ with respect to the relative
positions of the regulatory sequences of the targeting
construct that are inserted and the specific pattern of
splicing that needs to occur to produce the final,
processed transcript.
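
The difference between the two splicing outcomes can be illustrated with a minimal sketch. It models only the exon composition of the mature mRNA as described above; the function name `mature_mrna_exons` is hypothetical, and the exon list is limited to the three TPO exons shown in the figures (any further downstream exons are omitted here as a simplification).

```python
def mature_mrna_exons(endogenous_exons, first_retained,
                      exogenous_exon="exogenous exon"):
    """Return the exon order of the spliced transcript: the exogenous exon
    followed by the endogenous exons from `first_retained` onward."""
    start = endogenous_exons.index(first_retained)
    return [exogenous_exon] + endogenous_exons[start:]


# Only the exons depicted in the figures; downstream exons are not modeled.
tpo_exons = ["exon 1", "exon 2", "exon 3"]

embodiment_1 = mature_mrna_exons(tpo_exons, "exon 2")
embodiment_2 = mature_mrna_exons(tpo_exons, "exon 3")
print(embodiment_1)
print(embodiment_2)
```

In this simplified view, Embodiment 1 splices the exogenous exon to endogenous exon 2, while Embodiment 2 splices it to exon 3, bypassing exons 1 and 2.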

The invention further relates to a method of
producing TPO, β-interferon, or DNase I in vitro or in vivo
through introduction of a construct as described above into
host cell chromosomal DNA by homologous recombination to
produce a homologously recombinant cell. The homologously
recombinant cell is then maintained under conditions which
will permit transcription, translation and secretion of
TPO, β-interferon, or DNase I.

The present invention also relates to cells, such as
homologously recombinant primary or secondary cells (i.e.,
non-immortalized cells) and homologously recombinant
immortalized cells, useful for producing TPO, β-interferon,
or DNase I, methods of making such cells, methods of using
the cells for in vitro protein production, and methods of
gene therapy. Homologously recombinant cells of the
present invention are of vertebrate origin, particularly of
mammalian origin, and even more particularly of human
origin. Homologously recombinant cells produced by the
method of the present invention contain exogenous DNA which
causes the homologously recombinant cells to express a
desired gene at a higher level or with a pattern of regulation
or induction that is different from that which occurs in the
corresponding cell that has not undergone homologous
recombination.

In one embodiment, the activated TPO, β-interferon, or
DNase I gene can be further amplified by the inclusion of
an amplifiable selectable marker gene which has the
property that cells containing amplified copies of the
selectable marker gene can be selected for by culturing the
cells in the presence of the appropriate selectable agent.
The activated gene is amplified in tandem with the amplifiable
selectable marker gene. Cells containing many copies
of the activated gene are useful for in vitro protein
production and gene therapy.

Homologously recombinant cells of the present
invention are useful in a number of applications in humans
and animals. In one embodiment, the cells can be implanted
into a human or an animal for protein delivery in the human
or animal. For example, TPO, DNase I, or β-interferon can
be delivered systemically or locally in humans for
therapeutic benefit in the treatment of disease (TPO for



WO 96/29411 PCT/US96/03377

thrombocytopenia, DNase I for CF, or β-interferon for the
treatment of MS). In addition, homologously recombinant
non-human cells producing TPO, DNase I, or β-interferon of
non-human origin may be produced, and human or non-human
cells expressing TPO, DNase I, or β-interferon may be
enclosed within barrier devices and implanted into humans
or animals for use in a therapy.

Brief Description of the Drawings

Figure 1 is a schematic diagram of a strategy for
transcriptionally activating the TPO gene by the creation
of a novel transcription unit; thick lines: targeting
sequences; thin lines: introns and 5' upstream region;
cross-hatched box: regulatory sequence; stippled boxes:
noncoding exon sequences; black boxes: coding exon
sequences; open boxes: splice sites. The splice-donor site
(SD) of the exogenous exon in the targeting construct and
the splice-acceptor site (SA) flanking TPO exon 2 which is
involved in splicing to the exogenous exon are indicated.

Figure 2 is a schematic diagram of a strategy for
transcriptionally activating the TPO gene by the creation
of a novel transcription unit; thick lines: targeting
sequences; thin lines: intron 1 and 5' upstream regions;
cross-hatched box: regulatory sequence; stippled boxes:
noncoding exon sequences; black boxes: coding exon
sequences; open boxes: splice sites. The splice-donor site
(SD) of the exogenous exon in the targeting construct and
the splice-acceptor site (SA) flanking TPO exon 3 which is
involved in splicing to the exogenous exon are indicated.

Figure 3 presents the 6,943 bp genomic XbaI fragment
encompassing the 5' flanking region and exons 1, 2, and 3
of the human thrombopoietin (TPO) gene. The XbaI fragment
is depicted by the solid line, while exons 1, 2, and 3 are
represented by the solid boxes. The nucleotide positions
of the ApaI, BamHI, HindIII, EcoRI, NotI, SfiI and XbaI
recognition sequences are indicated. Nucleotides are
numbered starting at the hTPO ATG initiation codon.

Figures 4A-4D present the nucleotide sequence of
4,488 bp of genomic DNA (SEQ ID NO: 3) from the human TPO
locus lying 5' to the known cDNA sequence (de Sauvage et
al., op. cit.). Nucleotide numbers are noted at the
beginning of each line. Numbering is based on the ATG
initiation codon at position 1 (see Figures 5A-5B).
Ambiguities in the nucleotide sequence are represented
using the following code: R = A or G (purine); H = A, C, or
T; V = A, C, or G; N = A, C, G, or T; K = G or T; S = G or
C; W = A or T. The recognition sites for ApaI, BamHI,
HindIII, NotI, SfiI and XbaI and their corresponding
nucleotide positions are indicated above the sequence.
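
The ambiguity code given in the legend above can be expressed as a small lookup table. The following is a minimal sketch, assuming only the seven symbols listed in the legend; the names `AMBIGUITY_CODES`, `possible_bases`, and `matches` are hypothetical helpers introduced for illustration.

```python
# Ambiguity symbols exactly as listed in the legend for Figures 4A-4D.
AMBIGUITY_CODES = {
    "R": "AG",    # purine
    "H": "ACT",
    "V": "ACG",
    "N": "ACGT",
    "K": "GT",
    "S": "GC",
    "W": "AT",
}


def possible_bases(symbol):
    """Return the set of bases that a sequence symbol may represent."""
    symbol = symbol.upper()
    if len(symbol) == 1 and symbol in "ACGT":
        return {symbol}
    # Raises KeyError for symbols not listed in the legend.
    return set(AMBIGUITY_CODES[symbol])


def matches(pattern, sequence):
    """Return True if an unambiguous sequence is compatible with a
    pattern that may contain ambiguity symbols."""
    if len(pattern) != len(sequence):
        return False
    return all(base.upper() in possible_bases(sym)
               for sym, base in zip(pattern, sequence))


print(sorted(possible_bases("R")))
print(matches("GRN", "GAT"))
```

Symbols outside the legend (for example Y or M from the full IUPAC set) are deliberately not handled, since the source does not list them.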

Figures 5A-5B present the nucleotide sequence of
2,455 bp of genomic DNA (SEQ ID NO: 4) from the human TPO
locus extending downstream from the position of the 5' end
of the known cDNA sequence (de Sauvage et al., op. cit.).
Nucleotide numbers are noted at the beginning of each line.
Numbering is based on the ATG initiation codon at
position 1. Shown are exon 1, intron 1, exon 2, intron 2,
exon 3, and a portion of intron 3. Exons 1, 2, and 3 are
underlined, and the coding portions of exons 2 and 3 are
noted as underlined triplets. The intron-exon boundaries
are deduced from the published cDNA sequence (de Sauvage et
al., op. cit.). The recognition sites for ApaI, EcoRI, and
XbaI and their corresponding nucleotide positions are
indicated above the sequence.

Figure 6 is a schematic diagram of the strategy for
activating the human TPO gene using targeting construct
pTPO1 as described in Example 2. The positions of the dhfr
and neo markers, the exogenous CMV promoter and TPO exons
1-3 are indicated. Thick lines: targeting sequences; thin
lines: introns and 5' upstream region; cross-hatched box:
CMV promoter; stippled boxes: noncoding exon sequences;
black boxes: coding exon sequences; open boxes: splice
sites. The splice-donor site (SD) of the exogenous exon in
the targeting construct and the splice-acceptor site (SA)
flanking TPO exon 3 which is involved in splicing to the
exogenous exon are indicated. Recognition sites for BamHI
(B), NotI (N), ClaI (C), XhoI (X), and XbaI which are
relevant to the construction of the targeting construct are
marked.

Figure 7 is a schematic diagram of the strategy for
activating the human TPO gene using targeting construct
pTPO2 as described in Example 2. The positions of the dhfr
and neo markers, the exogenous CMV promoter and TPO exons
1-3 are indicated. Thick lines: targeting sequences; thin
lines: introns and 5' upstream region; cross-hatched box:
CMV promoter; heavily stippled boxes: noncoding exons from
the CMV IE gene; lightly stippled boxes: noncoding exon
sequences of TPO exons 1 and 2; black boxes: coding exon
sequences of TPO exons 2 and 3; open boxes: splice sites.
The splice-donor (SD) and splice-acceptor (SA) sites
flanking the noncoding exons in the targeting construct and
the splice-acceptor site (SA) flanking TPO exon 2 which is
involved in splicing to the unpaired splice-donor site of
the 3' exogenous exon are indicated. Recognition sites for
BamHI (B), HindIII (H), NotI (N), ClaI (C), SalI (S), EcoRI
(R), and XbaI which are relevant to the construction of the
targeting construct are marked.

Figure 8 is a schematic diagram of the strategy for
activating the human TPO gene using targeting construct
pTPO3 as described in Example 2. The positions of the dhfr
and neo markers, the exogenous CMV promoter and TPO exons
1-3 are indicated. Thick lines: targeting sequences; thin
lines: introns and 5' upstream region; cross-hatched box:
CMV promoter; stippled boxes: noncoding exon sequences of
TPO exons 1 and 2; black boxes: coding exon sequences (the
coding exon corresponding to hGH exon 1 in the targeting
construct and in the novel transcription unit is marked);
open boxes: splice sites. The splice-donor site (SD) of
the exogenous exon in the targeting construct and the
splice-acceptor site (SA) flanking TPO exon 3 which is
involved in splicing to the exogenous exon are indicated.
Recognition sites for BamHI (B), HindIII (H), ClaI (C),
XhoI (X), EcoRI (R), and XbaI which are relevant to the
construction of the targeting construct are marked.

Figure 9 is a diagrammatic representation of the
approximately 8 kb HincII fragment encompassing the 5'
flanking region, exons 1 and 2, and the sequences downstream
of exon 2 of the human DNase I gene. The HincII
fragment is depicted by the solid line, while exons 1 and 2
are represented by solid rectangular boxes. The nucleotide
positions of the ApaI, BamHI, HincII, EspI, SphI and SmaI
recognition sequences are indicated. Nucleotides are
numbered starting at the AUG initiation codon. The
nucleotide positions which reside upstream of exon 2 are
based on the DNA sequence presented in Figures 10 and 11.

Figures 10A-10D present the nucleotide sequence
encompassing 4,042 bp of DNA (SEQ ID NO: 17) from the human
DNase I locus lying 5' to the known cDNA sequence (Shak, S.
et al., op. cit.). Nucleotide numbers are noted at the
beginning of each line. Numbering is based on the ATG
initiation codon at position 1 (see Figure 11). The
recognition sites and the corresponding nucleotide
positions for ApaI, BamHI, HincII, EspI, and SphI are
indicated above the sequence.

Figure 11 presents the nucleotide sequence of 810 bp
of DNA (SEQ ID NO: 18) from the human DNase I locus
extending downstream from the position of the 5' end of the
known cDNA sequence (Shak, S. et al., op. cit.). Shown are
exon 1, intron 1, and a portion of exon 2. Exon 1 and 2
sequences are underlined and the coding sequences are noted
as underlined triplets. The positions of the putative CAP
site and the AUG initiation codon are indicated. The
intron-exon boundaries are deduced from the published cDNA
sequence (Shak, S. et al., op. cit.).

Figure 12 shows a strategy for activation of the human
DNase I gene by homologous recombination. The targeting
fragment is a 4633 bp BamHI fragment from pDNaseI which
contains: 283 bp of 5' targeting sequence from position
-1162 (BamHI site) to -860 (ApaI site), an amplifiable dhfr
expression unit, neo gene, CMV IE promoter, a CAP site, a
non-coding exon, an unpaired splice-donor site and 363 bp of
3' targeting sequence from position -860 (EspI site) to
-468 (BamHI site). The dhfr expression unit and the neo
gene are depicted by open arrows; the orientation of the
arrows represents the direction of transcription. The
positions of the CMV promoter, TATA box, CAP site and
splice donor sequence (SD) are indicated. Activation of
the DNase I gene is achieved by integration of the
targeting fragment into the genome of the recipient cells
by homologous recombination. The targeted gene product is
depicted in the lower panel of the figure. The mRNA
precursor, which includes a non-coding 5' exon, a chimeric
intron and exon 2 of the DNase I gene, is represented by the
thin arrow.

Figure 13 is a diagrammatic representation of 9,939 bp
encompassing the 5' flanking region, coding sequence and
the 3' untranslated region of the human β-interferon gene.
The 5' and 3' flanking regions are depicted by the solid
line and the transcribed region is represented by the solid
box. The nucleotide positions of the BalI, BglII, EcoRI and
PvuII recognition sequences are indicated. Nucleotides are
numbered starting at the β-interferon ATG translational
initiation codon (see Figure 15).

Figures 14A-14G present the nucleotide sequence of
8,355 bp of DNA (SEQ ID NO: 23) from the human β-interferon
locus lying 5' to the known sequence (GenBank HUMIFNB1F).
Nucleotide numbers are noted at the beginning of each line.
Numbering is based on the ATG initiation codon at position
1 (see Figure 15). The recognition sites for BglII, EcoRI
and PvuII and their corresponding nucleotide positions are
indicated above the sequence.

Figures 15A-15B present the nucleotide sequence of
1,584 bp of DNA (SEQ ID NO: 24) from the human β-interferon
locus extending downstream from the 5' end of the known
sequence (GenBank HUMIFNB1F). Nucleotide numbers are noted
at the beginning of each line. Numbering is based on the
ATG initiation codon at position 1. The transcribed region
is underlined and the coding sequences are noted as underlined
triplets. The positions of the CAP site and AUG
initiation codon are indicated. The recognition sites for
BalI, BglII and PvuII and their corresponding nucleotide
positions are indicated above the sequence.

Figure 16 depicts the strategy for activation of the
human β-interferon gene by homologous recombination using
targeting construct pIFNb-1 as described in Example 7. The
positions of the TATA box, CAP site, dhfr and neo markers,
the exogenous CMV promoter, and the β-interferon 5' flanking
region and coding sequence are indicated. Thick lines:
targeting sequences; thin lines: intron, β-interferon 5'
and 3' non-coding sequences; solid box: CMV promoter;
shaded box: endogenous β-interferon transcribed region;
cross-hatched box: non-coding CMV exon 1 and the chimeric
exon 2. The splice-donor site (SD) of the exogenous exon
and the splice-acceptor site (SA) flanking the chimeric
exon 2 are indicated. Recognition sites for BamHI, EcoRI,
HincII, NdeI and PvuII which are relevant to the
construction of the targeting construct are marked.



Detailed Description of the Invention 

The present invention, as set forth above, relates to a
method of expressing TPO, DNase I, or β-interferon in human
cells by activation of the endogenous TPO, DNase I, or
β-interferon genes. In the present invention, homologous
recombination is used to insert a regulatory region, an
exon, and a splice-donor site upstream of endogenous exons
coding for TPO, DNase I, or β-interferon, generating novel
transcription units which are active in the homologously
recombinant cell produced. The present invention further
relates to homologously recombinant cells produced by the
present method and to uses of the homologously recombinant
cells. In a related embodiment, an activated TPO, DNase I,
or β-interferon gene is amplified subsequent to activation,
thus allowing enhanced expression of the activated gene.

The invention is based upon the discovery that the
regulation or activity of endogenous genes of interest in a
cell can be altered by creating a novel gene, in which the
transcription product of the gene combines exogenous and
endogenous exons and is under the control of an exogenous
promoter. The method is practiced by inserting into a
cell's genome, at a preselected site, through homologous
recombination, DNA constructs comprising: (a) one or more
targeting sequences; (b) a regulatory sequence; (c) an exon;
and (d) an unpaired splice-donor site, wherein the targeting
sequence or sequences are derived from chromosomal DNA
within and/or upstream of a desired endogenous gene and
direct the integration of elements (a) - (d) such that
elements (b) - (d) are operatively linked to the endogenous
gene. In another embodiment, the DNA constructs comprise:
(a) one or more targeting sequences, (b) a regulatory
sequence, (c) an exon, (d) a splice-donor site, (e) an
intron, and (f) a splice-acceptor site, wherein the targeting
sequence or sequences are derived from chromosomal DNA
within and/or upstream of a desired endogenous gene and
direct the integration of elements (a) - (f) such that
elements (b) - (f) are operatively linked to the first
exon of the endogenous gene.

The present invention relates particularly to novel
DNA sequences that can be used in the construction of
targeting constructs. Non-coding genomic DNA sequences
within and upstream of the transcribed regions of the TPO
and DNase I genes, and upstream of the transcribed region
of the β-interferon gene, were cloned and are described for
the first time. These sequences or DNA fragments comprising
these sequences may be used as targeting sequences in
DNA constructs useful for gene activation by homologous
recombination. Typically, a targeting sequence is at least
about 20 base pairs in length. The size of the sequence is
chosen to be a size which selectively promotes homologous
recombination with desired genomic DNA sequences.

Analysis of the genomic DNA sequences and comparison
to the known cDNA sequences revealed features essential for
the construction of targeting constructs. For example, for
the first time, it is shown that the first exon of the
human TPO gene is entirely non-coding, and that translation
initiates within the second exon of the endogenous gene.
This information was important to the design of the gene
activation constructs described herein, in which splicing
of an exogenous exon to the endogenous second exon requires
that the exogenous exon be non-coding, or in which splicing
of an exogenous coding exon requires that targeting be
performed such that the exogenous coding exon is inserted
in a position so that it can be spliced to the endogenous
third exon of the TPO gene. Furthermore, the cloning of
approximately 6.3 kb of DNA sequence from upstream of the
human TPO gene provided targeting sequences useful for the
development of gene activation constructs. Figure 4 shows
approximately 4.5 kb of novel DNA sequence from the human
TPO locus lying 5' of the known cDNA sequence (de Sauvage,
F. J. et al., op. cit.). Figure 5 shows approximately
2.5 kb of DNA sequence from the human TPO locus extending
in the 3' direction from the 5' boundary of the known cDNA
sequence. Intron sequences (positions -1815 to -145,
positions 14 to 245, and positions 374 to 570) of Figure 5
are novel. DNA constructs comprising the novel sequences
of Figures 4 and 5, or fragments derived from these
sequences, are useful for homologous recombination as
taught herein.

Similarly, for the first time it is shown that the
first exon of the human DNase I gene is entirely non-coding.
This information was important to the design of
the targeting constructs described herein. Example 5, for
example, describes a targeting construct which includes two
non-coding exons separated by an intron, and which is
inserted upstream of DNase I exon 1. This configuration
allows promoter position to be optimized by varying the
length of either the exogenous intron or the intron present
between the exogenous exon and the endogenous second exon
of the DNase I gene, while ensuring that the primary
transcript will be spliced appropriately and that
translation initiates at the correct position for synthesis
of functional DNase I. Furthermore, the cloning of
approximately 4.5 kb of DNA sequence from upstream of the
human DNase I gene provided targeting sequences useful for
the development of gene activation constructs. Figure 10
shows approximately 4 kb of novel DNA sequence from the
human DNase I locus lying 5' of the known cDNA sequence
(Shak, S. et al., op. cit.). Figure 11 shows approximately
0.8 kb of DNA sequence from the human DNase I locus
extending in the 3' direction from the 5' boundary of the
known cDNA sequence. Intron sequences (positions -328 to
-2) of Figure 11 are novel. DNA constructs comprising the
novel sequences of Figures 10 and 11, or fragments derived
from these sequences, are useful for homologous
recombination as described herein.

Finally, the upstream region of the β-interferon gene
(a gene which is known to lack introns) was cloned and
sequenced, and a detailed restriction map was
produced. Previously, only 357 bp of DNA upstream of the
translation initiation codon was characterized (see Genbank
entry HUMIFNB1F). The cloning and sequence analysis
provided approximately 9.6 kb of genomic DNA upstream of
the gene for the design and construction of a targeting
construct (Example 7). Figure 14 shows approximately
8.4 kb of novel DNA sequence from the β-interferon locus
lying 5' of the known sequences (Genbank entry HUMIFNB1F).
DNA constructs comprising the novel sequences of Figure 14,
or fragments derived from these sequences, are useful for
homologous recombination as taught herein.

The following defines the DNA constructs of the
present invention, the elements comprising the DNA
constructs of the present invention (Section A), methods in
which the DNA constructs are used to produce homologously
recombinant cells (Section B), the structure of the
targeted gene and the resulting product (Section C), the
homologously recombinant cells produced (Section D), uses
of these cells (Sections E and F), and the advantages of
the constructs and methods described herein (Section G).

A. The DNA Construct 

The DNA constructs of the present invention include at
least the following components: a targeting sequence; a
regulatory sequence; an exon; and a splice-donor site. In
the construct, the exon is 3' of the regulatory sequence
and the splice-donor site is 3' of the exon. In addition,
there can be multiple exons and/or introns preceding (5'
to) the exon flanked by the splice-donor site. Taken as a
group, the exons, introns, and splice-sites are referred to
as the "structural elements" of the construct, so-called
because they are important in defining the structure of the
novel gene produced by homologous recombination between
genomic DNA and DNA of the targeting construct. As
described herein, there frequently are additional construct
components, such as selectable and/or amplifiable
markers.
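The ordering rule just stated (regulatory sequence, then exon, then splice-donor site, reading 5' to 3') can be illustrated with a small check. This is only a sketch for illustration: the element names and the function `has_required_elements` are hypothetical labels, not terms or procedures taken from this description, and the construct is modelled simply as a list of element names in 5' to 3' order.

```python
from typing import Sequence

def has_required_elements(elements: Sequence[str]) -> bool:
    """Check a construct given as element names listed 5' -> 3'.

    Requires at least one targeting sequence, and a splice-donor site that
    lies immediately 3' of an exon which is itself 3' of a regulatory
    sequence. Other elements (markers, additional exons or introns) may
    appear anywhere.
    """
    if "targeting_sequence" not in elements:
        return False
    for i, name in enumerate(elements):
        # Look for an exon directly followed by a splice-donor site.
        if name == "splice_donor" and i > 0 and elements[i - 1] == "exon":
            # A regulatory sequence must lie 5' of that exon.
            if "regulatory_sequence" in elements[: i - 1]:
                return True
    return False

minimal = ["targeting_sequence", "regulatory_sequence", "exon", "splice_donor"]
print(has_required_elements(minimal))
```

A list in which the regulatory sequence follows the exon and splice-donor site would be rejected by this check, mirroring the 5' to 3' order given above.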

The DNA in the construct is referred to as exogenous
DNA, defined herein as DNA which is introduced into a cell
by the methods described herein, such as with the DNA
constructs of the present invention. Exogenous DNA can
contain sequences identical to or different from the
endogenous DNA. The term endogenous DNA is defined herein
as DNA present in the cell as obtained.

The DNA of the construct can be obtained from sources 
in which it occurs in nature or can be produced, using 
genetic engineering techniques or synthetic processes. 

1. The Targeting Sequence 
The targeting sequence or sequences are DNA sequences
which permit homologous recombination into the genome of
the selected cell containing the gene of interest.
Targeting sequences are, generally, DNA sequences which are
homologous to (i.e., identical or sufficiently similar to)
DNA sequences present in the genome of the cells as
obtained (e.g., coding or noncoding DNA, located upstream
of the transcriptional start site, within the transcribed
region encompassing the gene, or downstream of the
transcriptional stop site of the gene, or sequences present
in the genome through a previous modification), such that
the targeting sequence and cellular DNA can undergo
homologous recombination. In general, two sequences are
described as homologous if a DNA strand of one sequence is
capable of hybridizing to a DNA strand of the other
sequence under conditions standardly used for the detection
of sequence similarity (see, for example, Ausubel et al.,
Current Protocols in Molecular Biology, Wiley, New York,
NY (1987)). The targeting sequence or sequences used are
selected with reference to the site into which the DNA in
the DNA construct is to be inserted and may be derived from
either genomic or cDNA sequences. Typically, a targeting
sequence is at least about 20 base pairs in length. The
size of the sequence is chosen to be a size which
selectively promotes homologous recombination with desired
genomic DNA sequences.

One or more targeting sequences can be employed. For
example, a circular plasmid or DNA fragment preferably
employs a single targeting sequence. A linear plasmid or
DNA fragment preferably employs two targeting sequences,
with the exogenous DNA to be inserted into the genome positioned
between the two targeting sequences. The targeting
sequence or sequences can be within an endogenous gene
(e.g., within the sequences of an exon and/or intron),
within the endogenous promoter sequences, or upstream of
the endogenous promoter sequences. The targeting sequence
or sequences can include those regions of a gene presently
known or sequenced and/or regions further upstream which
are structurally uncharacterized but can be mapped using
restriction enzymes and cloning approaches available to one
skilled in the art.

2. The Regulatory Sequence 

The regulatory sequence of the DNA construct can be
comprised of one or more of a variety of elements,
including: promoters (such as constitutive or inducible
promoters), enhancers, scaffold-attachment regions or
matrix attachment regions (McKnight, R.A. et al., Proc.
Natl. Acad. Sci. USA 89:6943-6947 (1992); Phi-Van, L. and
Stratling, W.H., EMBO J. 7:655-664 (1988)), negative
regulatory elements, locus control regions (Pondel, M.D. et
al., Nucl. Acids Res. 20:237-243 (1992); Li, Q. and
Stamatoyannopoulos, G., Blood 84:1399-1401 (1994)),
transcription factor binding sites, or combinations of said
sequences.

3. Structural Elements of the DNA Construct

a. Exons and Introns 
An exon is defined herein as a DNA sequence which is
copied into RNA and is present in a mature mRNA molecule.
An intron is defined as a sequence of one or more
nucleotides lying between two exons and which is removed,
by splicing, from a precursor RNA molecule in the formation
of an mRNA molecule.

The DNA constructs of the present invention contain
one or more exons. The exons can, optionally, contain DNA
which encodes one or more amino acids and/or partially
encodes an amino acid (i.e., one or two bases of a codon).
Where the exogenous exon or exons encode one or more amino
acids and/or a portion of an amino acid, the DNA construct
is designed such that, upon transcription and splicing, the
reading frame is in-frame with the second or subsequent
exon of the endogenous gene's coding region. As used
herein, in-frame means that the encoding sequences of, for
example, a first exon and a second exon when fused, join
together nucleotides in a manner that does not change the
appropriate reading frame of the portion of the mRNA
derived from the second exon.
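The in-frame requirement reduces to arithmetic modulo three on the length of the exogenous coding sequence. The following is a minimal sketch under stated assumptions: the function names and the toy sequences are hypothetical, the exogenous coding sequence is counted from the first base of its initiation codon to the last exon base, and the "phase" of the endogenous exon is taken to mean the number of bases at its 5' end that complete a codon begun upstream.

```python
def bases_needed_to_complete_codon(exogenous_coding: str) -> int:
    """Number of bases (0, 1 or 2) the downstream exon must supply to
    complete the last codon begun in the exogenous coding sequence."""
    return (-len(exogenous_coding)) % 3

def is_in_frame(exogenous_coding: str, endogenous_exon_phase: int) -> bool:
    """True if splicing preserves the reading frame of the endogenous exon.

    endogenous_exon_phase: number of 5' bases of the endogenous exon that
    belong to a codon started upstream (0, 1 or 2).
    """
    return bases_needed_to_complete_codon(exogenous_coding) == endogenous_exon_phase

# Toy example: ATG GCT G leaves one base of a partial codon,
# so the next exon must contribute two bases to finish it.
print(bases_needed_to_complete_codon("ATGGCTG"))
print(is_in_frame("ATGGCTG", 2))
```

An exogenous exon that ends after one or two bases of a codon is therefore compatible only with an endogenous exon whose first two or one bases, respectively, complete that codon.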

In the case of activating the TPO and DNase I genes,
the exogenous exon can, preferably, be derived from any
gene in which the exon includes a CAP site and non-coding
sequences. Examples would include the first exon of the
CMV immediate-early gene and of the follicle stimulating
hormone (FSH) gene. In the case of β-interferon, whose gene
contains no natural introns, there are preferably two
exogenous non-coding exons, separated by an intron, in the
targeting construct.

b. Splice-Sites 
Introns contained within the mRNA of eukaryotic cells
are removed through the recognition of signals termed
splice-donor and splice-acceptor sites. A splice-donor
site is a sequence which directs the splicing of one exon
to another exon. Typically, the first exon lies 5' of the
second exon, and the splice-donor site overlapping and
flanking the first exon on its 3' side recognizes a
splice-acceptor site flanking the second exon on the 5'
side of the second exon. Splice-donor sites have a
characteristic consensus sequence represented as:
(A/C)AGGURAGU (where R denotes a purine nucleotide) with
the GU in the fourth and fifth positions being required
(Jackson, I.J., Nucleic Acids Research 19:3715-3798
(1991)). The first three bases of the splice-donor
consensus site are the last three bases of the exon.
Splice-donor sites are functionally defined by their
ability to effect the appropriate reaction within the mRNA
splicing pathway.
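The consensus above translates directly into a pattern search. The sketch below is illustrative only: the function name `find_splice_donors` is hypothetical, and it reports only perfect matches to the full nine-base consensus, whereas functional splice-donor sites are defined by activity and may deviate from the consensus outside the required GU.

```python
import re

# (A/C)AGGURAGU as a regular expression over RNA; R (purine) is A or G.
# A lookahead is used so that overlapping candidate sites are all reported.
SPLICE_DONOR = re.compile(r"(?=[AC]AGGU[AG]AGU)")

def find_splice_donors(sequence: str) -> list:
    """Return 0-based indices of the first intron base (the G of the
    required GU) for every perfect match to the consensus.

    DNA input is accepted; T is converted to U before searching.
    """
    rna = sequence.upper().replace("T", "U")
    # The first three consensus bases are the last three bases of the exon,
    # so the intron starts three bases after the start of the match.
    return [m.start() + 3 for m in SPLICE_DONOR.finditer(rna)]

print(find_splice_donors("cccaaggtaagtccc"))
```

Returning the exon/intron boundary rather than the start of the match reflects the statement that the first three consensus bases belong to the exon.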

An unpaired splice-donor site is defined herein as a
splice-donor site which is present in a targeting construct
and is not accompanied in the targeting construct by a
splice-acceptor site positioned 3' to the unpaired
splice-donor site. Upon homologous recombination between
the targeting sequences and genomic DNA, the unpaired
splice-donor site results in splicing to an endogenous
splice-acceptor site.

A splice-acceptor site is a sequence which, like a
splice-donor site, directs the splicing of one exon to
another exon. Acting in conjunction with a splice-donor
site, the splicing apparatus uses a splice-acceptor site to
effect the removal of an intron. Splice-acceptor sites
have a characteristic sequence represented as:
YYYYYYYYYYNYAG, where Y denotes any pyrimidine and N
denotes any nucleotide (Jackson, I.J., Nucleic Acids
Research 19:3715-3798 (1991)).
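The splice-acceptor sequence can be searched for in the same way. This is a sketch under the same caveats: `find_splice_acceptors` is a hypothetical name, only exact matches to the fourteen-base pattern are reported, and real acceptor sites tolerate deviations within the pyrimidine tract.

```python
import re

# YYYYYYYYYYNYAG over RNA: ten pyrimidines (C or U), any nucleotide,
# one pyrimidine, then the AG that ends the intron.
SPLICE_ACCEPTOR = re.compile(r"(?=[CU]{10}[ACGU][CU]AG)")

def find_splice_acceptors(sequence: str) -> list:
    """Return 0-based indices of the first exon base, i.e. the base
    immediately 3' of the AG, for every exact match to the pattern.

    DNA input is accepted; T is converted to U before searching.
    """
    rna = sequence.upper().replace("T", "U")
    # The pattern is 14 bases long and lies entirely within the intron.
    return [m.start() + 14 for m in SPLICE_ACCEPTOR.finditer(rna)]

print(find_splice_acceptors("GCUCUCUCUCUACAGGAA"))
```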
c. Marker Genes for Selection and Amplification
The identification of the targeting event can be
facilitated by the use of one or more selectable marker
genes typically contained within the targeting DNA
construct. The use of both positively and negatively
selectable markers for identifying targeted events is
described in related pending applications U.S.S.N.
08/243,391, U.S.S.N. 07/985,586, U.S.S.N. 07/789,188,
PCT/US93/11704, and PCT/US92/09627.

Homologously recombinant cells containing multiple
copies of the novel transcription units produced by the
present invention may be isolated by including within the
targeting DNA construct an amplifiable marker gene which
has the property that cells containing multiple copies of
the selectable marker gene can be selected for by culturing
the cells in the presence of an appropriate selectable
agent. The novel transcription unit will be amplified in
tandem with the amplified selectable marker gene, allowing
the production of very high levels of the desired protein.
Amplifiable marker genes and their use are described in
applications U.S.S.N. 08/243,391, U.S.S.N. 07/985,586, and
PCT/US93/11704.

In one embodiment, the positively selectable marker neo
(derived from the bacterial neomycin
phosphotransferase gene) is used to select for cells which
have stably incorporated the DNA of the targeting
construct, and the mouse dhfr (dihydrofolate reductase)
gene is used to subsequently amplify the novel
transcription unit present in homologously recombinant
cells.

d. Additional Elements of the Targeting 
Construct 

As taught herein, gene targeting can be used to insert
a regulatory sequence within an endogenous gene (e.g.,
within the sequences of an exon and/or intron), within the
endogenous promoter sequences, or upstream of the
endogenous promoter sequences, with said genes
corresponding to the endogenous cellular TPO, β-interferon,
or DNase I gene. Alternatively or additionally, the
targeting constructs may be designed to include sequences
which affect the structure or stability of the TPO,
β-interferon, or DNase I protein or corresponding RNA
molecule. For example, RNA stability elements, splice
sites, and/or leader sequences of RNA molecules can be
modified to improve or alter the function, stability,
and/or translatability of an RNA molecule. Protein
sequences may also be altered, such as signal sequences,
active sites, and/or structural sequences for enhancing or
modifying glycosylation, transport, secretion, or
functional properties of a protein. According to this
method, introduction of the exogenous DNA results in the
alteration of the structural or functional properties of
the expressed proteins or RNA molecules.

In one embodiment, the method can be used to create
novel transcription units encoding fusion proteins in which
structural, enzymatic, or ligand or receptor binding
protein domains of another protein are fused to TPO, DNase
I, or β-interferon. In these cases the exogenous coding
DNA contains an ATG translation initiation codon in-frame
with the coding sequences of the endogenous TPO, DNase I,
or β-interferon gene. For example, the exogenous DNA can
encode a sequence which can anchor TPO or DNase I to a
membrane, a portion of a signal peptide designed to improve
cellular secretion, leader sequences, enzymatic regions,
transmembrane domain regions, co-factor binding regions, or
other functional regions.

The DNA construct can also include a bacterial origin
of replication and bacterial antibiotic resistance markers
or other selectable markers, which allow for large-scale
plasmid propagation in bacteria or any other suitable
cloning/host system.

B. Transfection and Homologous Recombination
According to the present method, the construct is
introduced into the cell, such as a primary, secondary, or
immortalized cell, as a single DNA construct, or as
separate DNA sequences which become incorporated into the
chromosomal or nuclear DNA of a transfected cell.

The targeting DNA construct can be introduced into
cells on a single DNA construct or on separate constructs.
The total length of the DNA construct will vary according
to the number of components and the length of each, and the
construct will generally be at least about 200 nucleotides.
Further, the DNA can be introduced as linear, double-stranded
(with or without single-stranded regions at one or
both ends), single-stranded, or circular DNA.

Any of the construct types of the disclosed invention
is then introduced into the cell to obtain a transfected
cell. The transfected cell is maintained under conditions
which permit homologous recombination, as is known in the
art (reviewed in Capecchi, M.R., Science 244:1288-1292
(1989)). When the homologously recombinant cell is
maintained under conditions sufficient for transcription of
the DNA, the regulatory region introduced by the targeting
construct, as in the case of a promoter, will activate
expression of the novel transcription unit produced by
homologous recombination.

The DNA constructs may be introduced into cells by a
variety of physical or chemical methods, including
electroporation, microinjection, microprojectile
bombardment, calcium phosphate precipitation, and
liposome-, polybrene-, or DEAE dextran-mediated
transfection.

C. The Targeted Gene and Resulting Product
The targeting DNA construct, when introduced by
homologous recombination or targeting into cells containing
the TPO, β-interferon, or DNase I gene, produces a novel
transcription unit which results in the expression of TPO,
β-interferon, or DNase I.

At the targeted site in the genome, the exogenous
regulatory sequence is operatively linked to a CAP site,
which initiates transcription. Operatively linked is
defined as a configuration in which the exogenous
regulatory sequence, exon, splice-donor site and,
optionally, an intron sequence and splice-acceptor site,
are appropriately targeted at a position relative to the
endogenous gene such that the regulatory element directs
the production of a primary RNA transcript which initiates
at a CAP site and includes sequences corresponding to the
exogenous exon or exons and endogenous exons of the TPO, DNase
I, or β-interferon gene. In an operatively linked
configuration the splice-donor site of the targeting
construct directs a splicing event between an exogenous
exon and the splice-acceptor site of an endogenous exon,
such that a desired protein can be produced from the fully
spliced mature transcript. In one embodiment, the
splice-acceptor site is endogenous, such that the splicing
event is directed to an endogenous exon of the TPO or DNase
I gene. In another embodiment, an intron and a
splice-acceptor site are included in the targeting construct used
to activate the β-interferon gene, and a splicing event
removes the intron introduced by the targeting construct.



D. The Homologously Recombinant Cells

The targeting event results in the insertion of the
regulatory and structural sequences of the targeting
construct into a cell's genome, creating a novel
transcriptional unit under the control of the exogenous
regulatory sequences.

Homologous recombination between the genomic DNA and
the introduced DNA results in a homologously recombinant
cell, which may be a primary, secondary, or immortalized
human or other mammalian cell in which sequences which
alter the expression of an endogenous gene are operatively
linked to the endogenous TPO, DNase I, or β-interferon
gene. Particularly, the invention includes a homologously
recombinant cell comprising exogenous regulatory sequences
and an exon, flanked by a splice-donor site, which are
introduced at a predetermined site by a targeting DNA
construct, and are operatively linked to the coding region
of the endogenous gene. Optionally, there may be multiple
exogenous exons (coding or non-coding) and introns
operatively linked to any exon of the endogenous gene. The
resulting homologously recombinant cells are cultured under
conditions which select for amplification, if appropriate,
of the DNA encoding the amplifiable marker and the novel
transcriptional unit. With or without amplification, cells
produced by this method can be cultured under conditions,
as are known in the art, suitable for the expression of
TPO, β-interferon, or DNase I.

The targeting constructs and methods of the present
invention may be used with, for example, primary or
secondary cell strains (which exhibit a finite number of
mean population doublings in culture and are not
immortalized) and immortalized cell lines (which exhibit an
apparently unlimited lifespan in culture). Primary and
secondary cells include, for example, fibroblasts,
keratinocytes, epithelial cells (e.g., mammary epithelial
cells, intestinal epithelial cells), endothelial cells,
glial cells, neural cells, formed elements of the blood
(e.g., lymphocytes, bone marrow cells), muscle cells and
precursors of these somatic cell types. Where the
homologously recombinant cells are to be used in gene
therapy, primary cells are preferably obtained from the
individual to whom the resulting homologously recombinant
cells are administered. However, primary cells can be
obtained from a donor (other than the recipient) of the
same species. Examples of immortalized human cell lines
which may be used with the DNA constructs and methods of
the present invention include, but are not limited to,
HT1080 cells (ATCC CCL 121), HeLa cells and derivatives of
HeLa cells (ATCC CCL 2, 2.1 and 2.2), MCF-7 breast cancer
cells (ATCC HTB 22), K-562 leukemia cells (ATCC CCL 243),
KB carcinoma cells (ATCC CCL 17), 2780AD ovarian carcinoma
cells (Van der Bliek, A.M. et al., Cancer Res. 48:5927-5932
(1988)), Raji cells (ATCC CCL 86), WiDr colon adenocarcinoma
cells (ATCC CCL 218), SW620 colon adenocarcinoma cells
(ATCC CCL 227), Jurkat cells (ATCC TIB 152), Namalwa cells
(ATCC CRL 1432), HL-60 cells (ATCC CCL 240), Daudi cells
(ATCC CCL 213), RPMI 8226 cells (ATCC CCL 155), U-937 cells
(ATCC CRL 1593), Bowes Melanoma cells (ATCC CRL 9607),
WI-38VA13 subline 2R4 cells (ATCC CCL 75.1), and MOLT-4
cells (ATCC CRL 1582), as well as heterohybridoma cells
produced by fusion of human cells and cells of another
species. Secondary human fibroblast strains, such as WI-38
(ATCC CCL 75) and MRC-5 (ATCC CCL 171), may be used.
Further discussion of the types of cells that may be used
in practicing the methods of the present invention is
presented in applications U.S.S.N. 08/243,391, U.S.S.N.
07/985,586, U.S.S.N. 07/789,188, U.S.S.N. 07/911,533,
U.S.S.N. 07/787,840, PCT/US93/11704, and PCT/US92/09627.

E. In Vivo Protein Production

Homologously recombinant cells of the present
invention in which the expression properties of the
endogenous TPO, β-interferon, or DNase I gene are altered
are useful in gene therapy, as populations of homologously
recombinant cell lines, as populations of homologously
recombinant primary or secondary cells, homologously
recombinant clonal cell strains or lines, homologously
recombinant heterogenous cell strains or lines, and as cell
mixtures in which at least one representative cell of one
of the preceding categories of homologously recombinant
cells is present. Homologously recombinant primary cells,
clonal cell strains or heterogenous cell strains are
administered to an individual in whom the abnormal or
undesirable condition is to be treated or prevented, in
sufficient quantity and by an appropriate route, to express
or make available the desired product at physiologically
relevant levels. A physiologically relevant level is one
which either approximates the level at which the product is
normally produced in the body or results in improvement of
the abnormal or undesirable condition. Methods for gene
therapy in which homologously recombinant cells are
introduced into an individual for the purpose of in vivo
protein production are described in pending applications
U.S.S.N. 08/243,391, U.S.S.N. 07/985,586, U.S.S.N.
07/789,188, U.S.S.N. 07/911,533, U.S.S.N., PCT/US93/11704,
and PCT/US92/09627.

In one embodiment, the invention relates to a method
of providing TPO to a mammal by introducing homologously
recombinant cells into the mammal in sufficient number to
produce an effective amount of TPO in the mammal.

In another embodiment, homologously recombinant cells
expressing DNase I can be administered to the trachea and
lungs of a cystic fibrosis patient, for the purpose of in
vivo secretion of DNase I for the relief of respiratory
distress.

In a third embodiment, homologously recombinant cells
expressing β-interferon may be implanted into a patient
suffering from multiple sclerosis, for the purpose of in
vivo secretion of β-interferon to diminish exacerbations
associated with the disease.

F. In Vitro Protein Production

Homologously recombinant cells produced according to
this invention can also be used for in vitro production of
TPO, β-interferon, or DNase I. The cells are maintained
under conditions, as are known in the art, which result in
expression of the protein. Proteins expressed using the
methods described may be purified from cell lysates or cell
supernatants. Proteins made according to this method can
be prepared as a pharmaceutically-useful formulation and
delivered to a human or non-human animal by conventional
pharmaceutical routes as is known in the art (e.g., oral,
intravenous, intramuscular, intranasal, intratracheal or
subcutaneous). As described herein, the homologously
recombinant cells can be immortalized, primary, or
secondary human cells. The use of cells from other species
may be desirable in cases where the non-human cells are
advantageous for protein production purposes where the
non-human TPO, DNase I, or β-interferon produced is useful
therapeutically.

G. Advantages 

The methodologies, DNA constructs, cells, and
resulting proteins of the invention herein possess
versatility and many other advantages over processes
currently employed within the art in gene targeting. The
ability to activate expression of an endogenous TPO,
β-interferon, or DNase I gene by positioning an exogenous
regulatory sequence and other structural sequences at
various positions ranging from directly fused to portions
of the normal gene's coding region to 30 kilobase pairs or
further upstream of the transcribed region of an endogenous
gene, or within an intron of an endogenous gene, is
advantageous for gene expression in cells. For example, it
can be employed to position the regulatory element upstream
or downstream of regions that normally silence or
negatively regulate a gene. The positioning of a
regulatory element upstream or downstream of such a region
can override such dominant negative effects that normally
inhibit transcription. In addition, regions of DNA that
normally inhibit transcription or have an otherwise
detrimental effect on the expression of a gene may be
deleted using the targeting constructs described herein.
The present invention also allows proteins to be expressed
in the context of their normal intron sequences, which have
been shown to be important factors in the expression of
genes in mammalian cells (cf. Korb, M. et al., Nucl. Acids
Res. 21:5901-5908 (1993)).

Additionally, since promoter function is known to
depend strongly on the local environment, a wide range of
positions may be explored in order to find those local
environments optimal for function. However, because ATG
start codons are found frequently within mammalian DNA
(approximately one occurrence per 48 base pairs, as
calculated from nearest-neighbor dinucleotide frequencies
in human DNA), transcription cannot simply initiate at any
position upstream of a gene and produce a transcript
containing a long leader sequence preceding the correct ATG
start codon: the frequent occurrence of ATG codons in
such a leader sequence will prevent translation of the
correct gene product and render the message useless. Thus,
the incorporation of an exogenous exon, a splice-donor
site, and, optionally, an intron and a splice-acceptor site
into targeting constructs comprising a regulatory region
allows gene expression to be optimized by identifying the
optimal site for regulatory region function, without the
limitation imposed by needing to avoid inappropriate ATG
start codons in the mRNA produced. This provides
significantly increased flexibility in the placement of the
construct and makes it possible to activate a wider range
of genes than is possible using other technologies. For
example, U.S. Patent No. 5,272,071 and foreign patent
applications WO 91/06666, WO 91/06667 and WO 90/11354
describe homologous recombination methods for inserting a
regulatory sequence upstream of the coding region of an
endogenous gene. In these methods, only a very small
number of positions for promoter insertion are acceptable
for expression, limited by the frequent occurrence of ATG
start codons as described above.
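
The constraint described above can be put in rough numbers. The
following sketch is illustrative only: it assumes that ATG triplets
occur independently at the stated average of one per 48 base pairs
(a Poisson approximation that is not part of the original text), and
the function names are hypothetical.

```python
import math

# Average spacing of ATG triplets quoted in the text (bp per occurrence).
ATG_SPACING_BP = 48


def expected_atg_count(leader_length_bp: int) -> float:
    """Expected number of ATG triplets in a leader of the given length."""
    return leader_length_bp / ATG_SPACING_BP


def prob_leader_free_of_atg(leader_length_bp: int) -> float:
    """Chance that a leader of this length contains no ATG at all.

    Assumption (not from the source): ATG occurrences are independent,
    so the number of ATGs in the leader is approximately Poisson.
    """
    return math.exp(-expected_atg_count(leader_length_bp))


if __name__ == "__main__":
    for length in (48, 480, 4800):
        print(length, expected_atg_count(length), prob_leader_free_of_atg(length))
```

Under these assumptions a leader of a few hundred base pairs is
already very likely to contain at least one upstream ATG, which is
consistent with the limitation the text describes.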

The present invention provides further advantages over
the methods available in the art. For example, the use of
homologous recombination results in the production of cells
in which the novel transcription unit is present in the
same location in all cells in which homologous
recombination has occurred. Thus, the novel transcription
unit will function similarly in all homologously
recombinant cells derived independently. This allows for
the production of cells with highly predictable properties.
In the case of in vitro protein production, it is desirable
to develop cells in which the behavior (e.g., the expression
and amplification properties) of the desired gene can be
controlled and there is little variation when comparing
individual cells which are being processed for large-scale
production purposes. In the case of in vivo protein
production or gene therapy, it is desirable to be able to
develop cells in which the properties are predictable and
uniform among individual patients. This allows for a high
degree of precision in achieving appropriate levels of the
desired protein in vivo, leading to controlled and
reproducible methods for treating disease.

The DNA constructs described above are useful for
operatively linking exogenous regulatory and structural
elements to endogenous coding sequences in a way that
precisely creates a novel transcriptional unit, provides
flexibility in the relative positioning of exogenous
regulatory elements and endogenous genes and, ultimately,
enables a highly controlled system for regulating
expression of genes of therapeutic interest.

The subject invention will now be illustrated by the 
following examples, which are not intended to be limiting 
in any way. 

EXAMPLES 

EXAMPLE 1: Cloning of the TPO Gene and Identification of
5' Flanking Sequences
The human thrombopoietin gene was isolated from a
human genomic DNA library. The library was prepared from
male leukocyte DNA partially digested with MboI and cloned
into the bacteriophage vector lambda EMBL3 (Clontech, Palo
Alto, CA; Cat. #HL1006d). For screening, a probe was
isolated by PCR amplification of human genomic DNA using
oligonucleotides 1.1 and 1.2.

Oligo 1.1 (TPO sense) (SEQ ID NO: 1) 

5' AATTGCTCCT CGTGGTCATG CTTCT

Oligo 1.2 (TPO anti-sense) (SEQ ID NO: 2) 
5' CTGTGAAGGA CATGGGAGTC A 

These primers were designed using the known TPO mRNA
sequence (de Sauvage, F. J. et al., Nature 369:533-538
(1994)). The amplified probe (probe A; 120 bp) was labeled
with 32P-dCTP by the polymerase chain reaction and used to
screen the genomic DNA library. Filters were hybridized
for 6 hours at 68°C in 125 mM Na2HPO4 (pH 7.2), 250 mM
NaCl, 10% PEG 8000, 7% SDS, 1 mM EDTA. Filters were washed
twice in 500 ml of 20 mM Na2HPO4 (pH 7.2), 1 mM EDTA, 5%
SDS, followed by 4 washes in 500 ml of 20 mM Na2HPO4 (pH
7.2), 1 mM EDTA, 1% SDS. The wash buffers were pre-heated
to 56°C and washing was done on a rotary shaker at room
temperature for approximately 5 minutes per wash. The
hybridizing signals were identified by autoradiography at
-80°C with an intensifying screen. In one experiment,
approximately 1.4 x 10^6 phage were screened and 7 positive
signals were obtained. Phage plaques corresponding to
positive signals were plaque purified. Following 2 rounds
of plaque purification by low density screening using probe
A, 4 of the phage, designated 5B, 25A, 25B and 28B, were
retained for further analysis. Plaque purified phage were
amplified and isolated by cesium chloride gradient
ultracentrifugation (Yamamoto, K.R. et al., Virology 40:734
(1970)) and DNA was isolated. Library screening, plaque
purification of recombinant bacteriophage, and isolation of
bacteriophage DNA were performed using standard methods
(Ausubel et al., Current Protocols in Molecular Biology,
Wiley, New York, NY (1987)).

An approximately 6.9 kb XbaI fragment comprising exon
1, intron 1, exon 2, intron 2, exon 3, and a portion of
intron 3, as well as approximately 4.3 kb of nontranscribed
DNA lying upstream of TPO exon 1, was identified by
restriction enzyme and Southern hybridization analysis
using probe A. This fragment was isolated from one genomic
clone (28B) and subcloned into plasmid pBSIISK+ (Stratagene
Inc., La Jolla, CA) for further analysis. The resultant
clones, pBS(X)/5'Thromb.8 and pBS(X)/5'Thromb.2, harbor the
6.9 kb XbaI fragment in opposite orientations with respect
to the plasmid backbone. Restriction enzyme mapping
yielded the restriction enzyme map shown in Figure 3. The
nucleotide sequence of the portion of this fragment lying
upstream of the 5' end of the known cDNA sequence is shown
in Figure 4 (SEQ ID NO: 3). The nucleotide sequence of the
portion of the 6.9 kb XbaI fragment lying downstream of the
5' end of the known cDNA sequence is shown in Figure 5 (SEQ
ID NO: 4). Comparison of the cloned genomic sequence
presented here with the published cDNA sequence (de
Sauvage, F. J. et al., Nature 369:533-538 (1994)) reveals
that the 5' end of the TPO gene consists of a non-coding
exon (exon 1) of at least 107 bp, a second exon (exon 2)
which is 158 bp, and a third exon (exon 3) which is 128 bp
in length. The 13 base pairs at the 3' end of exon 2 code
for the first four and a portion of the fifth amino acid of
the TPO signal peptide. Exon 3 codes for the remainder of
the 21 amino acid signal peptide and a portion of the
mature TPO polypeptide. Exons 1 and 2 are separated by
intron 1 (1671 bp), and exons 2 and 3 are separated by
intron 2 (231 bp). There are two differences between the
sequence reported in Figure 5 and the sequence published by
de Sauvage et al.: nucleotides at positions -134 and -124
are reported as C residues by de Sauvage et al. and are
shown as T residues in Figure 5. These residues are
outside of the coding sequence for TPO and may be explained
by sequence polymorphism or by errors in compilation of the
published sequence. In any event, these minor differences
do not impact the ability of the person of skill to
practice the invention as described herein.
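
The exon and intron lengths given above are enough to place each
feature relative to the TPO translation start site. The sketch below
does that arithmetic. It assumes a coordinate convention in which the
A of the ATG is +1 and the preceding base is -1 (no position 0), and
it treats exon 1 at its stated minimum length; the function names are
hypothetical.

```python
# Lengths stated in Example 1 (bp).
EXON1_MIN_LEN = 107     # exon 1 is "at least" this long
INTRON1_LEN = 1671
EXON2_LEN = 158
EXON2_CODING_LEN = 13   # the last 13 bp of exon 2 are coding
INTRON2_LEN = 231
EXON3_LEN = 128


def to_coord(offset: int) -> int:
    """Map a 0-based offset from the A of the ATG to a signed coordinate.

    Assumed convention: the A of the ATG is +1, the base before it is -1.
    """
    return offset + 1 if offset >= 0 else offset


def tpo_5prime_map() -> dict:
    """Return {feature: (first, last)} relative to the translation start."""
    exon2_start = -(EXON2_LEN - EXON2_CODING_LEN)
    exon2_end = EXON2_CODING_LEN - 1
    intron1_start = exon2_start - INTRON1_LEN
    intron1_end = exon2_start - 1
    exon1_end = intron1_start - 1
    exon1_start = exon1_end - EXON1_MIN_LEN + 1   # minimum extent only
    intron2_start = exon2_end + 1
    intron2_end = intron2_start + INTRON2_LEN - 1
    exon3_start = intron2_end + 1
    exon3_end = exon3_start + EXON3_LEN - 1
    spans = {
        "exon 1": (exon1_start, exon1_end),
        "intron 1": (intron1_start, intron1_end),
        "exon 2": (exon2_start, exon2_end),
        "intron 2": (intron2_start, intron2_end),
        "exon 3": (exon3_start, exon3_end),
    }
    return {name: (to_coord(a), to_coord(b)) for name, (a, b) in spans.items()}


def feature_at(position: int) -> str:
    """Name the feature containing a coordinate, if any."""
    for name, (first, last) in tpo_5prime_map().items():
        if first <= position <= last:
            return name
    return "outside the mapped region"


if __name__ == "__main__":
    for name, span in tpo_5prime_map().items():
        print(name, span)
```

Under this convention intron 1 spans -1816 to -146 and intron 2 spans
+14 to +244, which agrees with the primer positions quoted in
Example 2 (-1881 in exon 1, -1374 in intron 1, +14 to +42 and +182 in
intron 2).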

EXAMPLE 2: Construction of Targeting Plasmids for
Activation and Amplification of the TPO Gene
The activation of the TPO gene can be accomplished by
a number of strategies, as shown in Figures 6-8. In the
strategy shown in Figure 6, a targeting fragment is
introduced into the genome of recipient cells for insertion
of a regulatory region, a non-coding exon, and a
functional, unpaired splice-donor site upstream of the TPO
coding region. Specifically, the targeting construct from
which this fragment is derived (pRTPO1) is designed to
include a first targeting sequence homologous to sequences
upstream of the TPO gene, an amplifiable marker gene, a
selectable marker gene, a regulatory region, a CAP site, a
non-coding exon, an unpaired splice-donor site, and a
second targeting sequence corresponding to sequences
downstream of the first targeting sequence but upstream of
TPO exon 1. By this strategy, homologously recombinant
cells produce an mRNA precursor which includes the
non-coding exon introduced upstream of the TPO gene by
homologous recombination, the second targeting sequence and
any sequences between the second targeting sequence and
exon 2 of the TPO gene, and the remaining exons, introns,
and 3' untranslated regions of the TPO gene (Figure 6).
Splicing of this message results in the fusion of the
exogenous non-coding exon to exon 2 of the endogenous TPO
gene which, when translated, will produce TPO. In this
strategy the first and second targeting sequences are
upstream of the normal target gene, but this is not
required (see below). The size of the intron in the
targeting construct and thus the position of the regulatory
region relative to the coding region of the gene may be
varied to optimize the function of the regulatory region.
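
The splicing outcome described for the Figure 6 strategy can be
followed with a toy bookkeeping model. This is only a sketch: the
rule used (each splice donor joins to the next downstream splice
acceptor) is a simplifying assumption made here for illustration, not
a statement about how splice sites are chosen in cells, and the
Segment type and splice function are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    name: str
    has_acceptor: bool = False  # splice-acceptor site at the 5' end
    has_donor: bool = False     # splice-donor site at the 3' end


def splice(pre_mrna: List[Segment]) -> List[str]:
    """Toy splicing rule (an assumption made for illustration).

    Start at the first segment. While the current segment ends in a
    donor, join it to the next downstream segment that begins with an
    acceptor and discard everything in between.
    """
    mature = [pre_mrna[0].name]
    i = 0
    while pre_mrna[i].has_donor:
        j: Optional[int] = next(
            (k for k in range(i + 1, len(pre_mrna)) if pre_mrna[k].has_acceptor),
            None,
        )
        if j is None:
            break
        mature.append(pre_mrna[j].name)
        i = j
    return mature


# Precursor for the Figure 6 strategy, written 5' to 3'.
FIGURE_6_PRECURSOR = [
    Segment("exogenous non-coding exon", has_donor=True),  # unpaired donor
    Segment("second targeting sequence"),
    Segment("TPO exon 1", has_donor=True),                 # no acceptor of its own
    Segment("TPO intron 1"),
    Segment("TPO exon 2", has_acceptor=True, has_donor=True),
    Segment("TPO intron 2"),
    Segment("TPO exon 3", has_acceptor=True, has_donor=True),
    Segment("remaining TPO exons", has_acceptor=True),     # lumped together
]

if __name__ == "__main__":
    print(splice(FIGURE_6_PRECURSOR))
```

In this model the second targeting sequence, TPO exon 1, and intron 1
are removed together, leaving the exogenous exon joined to exon 2, as
the text states.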

Plasmid pRTPO1 is constructed as follows: Based on the
restriction map of the TPO upstream region (Figure 3), a
3.5 kb BamHI fragment can be isolated from subclone
pBS(X)/5'Thromb.8 (Example 1). This fragment is ligated to
BamHI digested plasmid pBS (Stratagene, Inc., La Jolla, CA)
and transformed into competent E. coli cells to generate
pBS-TPO1. This fragment includes sequences lying upstream
of TPO exon 1. Next, a 0.73 kb fragment is amplified from
hGH expression construct pXGH308, which has the CMV
immediate-early (IE) gene promoter region beginning at
nucleotide 546 and ending at nucleotide 2105 of Genbank
sequence HS5MIEP fused to the hGH sequences beginning at
nucleotide 5225 and ending at nucleotide 7322 of Genbank
sequence HUMGHCSA, using oligonucleotides 2.1 and 2.2.
(The source of the CMV IE gene is not critical, and other
CMV IE promoter-based plasmids may be used, or wild-type
CMV DNA may be used.) Oligo 2.1 (37 bp, SEQ ID NO: 5)
hybridizes to the CMV IE promoter at -614 relative to the
cap site (in Genbank sequence HEHCMVP1), and includes a
NotI site followed by a partially overlapping XhoI site at
its 5' end. Oligo 2.2 (36 bp, SEQ ID NO: 6) hybridizes to
the CMV IE promoter at +131 relative to the cap site,
includes the first 10 base pairs of the first intron of the
CMV IE gene, and contains a NotI site at its 5' end. The
resulting PCR fragment is digested with NotI and
gel-purified. Plasmid pBS-TPO1 is digested with NotI,
which cleaves at a single site upstream of TPO exon 1
(Figure 3), and the digested DNA is ligated to the CMV
promoter fragment prepared above and transformed into
competent E. coli cells. Colonies containing inserts of
the CMV promoter inserted at the NotI site of pBS-TPO1 are
analyzed by restriction enzyme analysis to confirm the
orientation of the insert, and one recombinant plasmid in
which the CMV promoter is oriented such that the direction
of transcription is towards TPO exon 1 is identified and
designated pBS-TPO2.

Oligo 2.1 (SEQ ID NO: 5)

5' TTTTGCGGCC GCTCGAGGAC ATTGATTATT GACTAGT
       NotI    XhoI

Oligo 2.2 (SEQ ID NO: 6)

5' TTTTGCGGCC GCCGGTACTT ACGTCACTCT TGGCAC
       NotI

Next, the neomycin phosphotransferase (neo) gene is
inserted into pBS-TPO2 for use as a selectable marker in
isolating stably transfected human cells. Plasmid
pMC1neoPolyA [Thomas, K.R. and Capecchi, M.R., Cell
51:503-512 (1987); available from Stratagene Inc., La
Jolla, CA] is digested with BamHI and made blunt-ended by
treatment with the Klenow fragment of E. coli DNA
polymerase. The treated DNA is then ligated to a
double-stranded 10 base pair ClaI linker of the sequence
5' GGATCGATCC, chosen such that the BamHI site is not
regenerated by the linker addition. The resulting DNA is
digested with ClaI and the digested DNA is ligated under
dilute conditions to promote recircularization and
transformed into competent E. coli cells. Transformed
colonies are analyzed by restriction enzyme digestion to
identify cells containing a derivative of plasmid
pMC1neoPolyA with an insertion of a ClaI site at the 3' end
of the neo gene. This plasmid is designated pMC1neo-C.
pMC1neo-C is digested with XhoI and SalI and the
approximately 1.1 kb fragment containing the neo
expression unit is gel purified. Plasmid pBS-TPO2 is
digested at the unique XhoI site which was introduced by
PCR at the 5' end of the CMV promoter, and the digested DNA
is ligated to the purified XhoI-SalI fragment containing
the neo gene and transformed into competent E. coli cells.
Colonies containing inserts of the neo gene inserted at the
XhoI site of pBS-TPO2 are analyzed by restriction enzyme
analysis to confirm the orientation of the insert, and one
recombinant plasmid in which the neo gene is oriented such
that the direction of transcription is opposite to CMV is
identified and designated pBS-TPO3.
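
The statement that the ClaI linker does not regenerate the BamHI site
can be checked by writing out the junction. The sketch below models
only the bases contributed by the filled-in BamHI ends and one copy
of the linker; flanking plasmid sequence is omitted, and the function
names are hypothetical.

```python
BAMHI_SITE = "GGATCC"        # BamHI cuts G^GATCC
CLAI_SITE = "ATCGAT"         # ClaI cuts AT^CGAT
CLAI_LINKER = "GGATCGATCC"   # linker sequence given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def junction_after_linker(linker: str) -> str:
    """Top-strand sequence across a filled-in BamHI site carrying one linker.

    After BamHI cleavage and Klenow fill-in, the upstream blunt end reads
    ...GGATC and the downstream blunt end reads GATCC...
    """
    left_end = "GGATC"
    right_end = "GATCC"
    return left_end + linker + right_end


if __name__ == "__main__":
    junction = junction_after_linker(CLAI_LINKER)
    print(junction)
    print("BamHI site regenerated:", BAMHI_SITE in junction)
    print("ClaI sites:", junction.count(CLAI_SITE))
```

The linker is its own reverse complement, so the result is the same
in either orientation: the junction carries a single ClaI site and no
BamHI site.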

Finally, the targeting construct pRTPO1 is constructed
by insertion of a dhfr expression unit (to select for
amplification in targeted human cells) at the ClaI site
located at the 5' end of the neo gene of pBS-TPO3. To
obtain a dhfr expression unit, the plasmid construct
pF8CIS9080 [Eaton et al., Biochemistry 25:8343-8347
(1986)] is digested with EcoRI and SalI. A 2 kb fragment
containing the dhfr expression unit is purified from this
digest and made blunt by treatment with the Klenow fragment
of DNA polymerase I. A ClaI linker (New England Biolabs,
Beverly, MA) is then ligated to the blunted dhfr fragment.
The products of this ligation are digested with ClaI and
ligated to ClaI digested pBS-TPO3. An aliquot of this
ligation is transformed into E. coli and plated on
ampicillin selection plates. Bacterial colonies are
analyzed by restriction enzyme digestion to determine the
orientation of the inserted dhfr fragment. One plasmid
with dhfr in a transcriptional orientation opposite that of
the neo gene is designated pRTPO1. For targeting to the
TPO locus in cultured human cells, pRTPO1 is digested with
BamHI to separate the targeting fragment containing the
targeting DNA, neo gene, dhfr gene, CMV promoter, and
splice-donor site from the pBS plasmid backbone.

A second strategy for activation of the TPO gene is
shown in Figure 7. In this strategy, a targeting fragment
is introduced into the genome of recipient cells for
insertion of a regulatory region, a non-coding exon, a
splice-donor site, an intron, a splice-acceptor site, a
second non-coding exon, and a functional, unpaired
splice-donor site upstream of the TPO coding region.
Specifically, the targeting construct from which this
fragment is derived (pRTPO2) is designed to include a first
targeting sequence homologous to sequences upstream of the
TPO gene, an amplifiable marker gene, a selectable marker
gene, a regulatory region, a CAP site, a non-coding exon, a
splice-donor site, an intron, a splice-acceptor site, a
second non-coding exon, an unpaired splice-donor site, and
a second targeting sequence corresponding to sequences
downstream of the first targeting sequence but upstream of
TPO exon 2. By this strategy, homologously recombinant
cells produce an mRNA precursor which corresponds to the
first and second non-coding exogenous exons separated by an
intron, the second targeting sequence, any sequences
between the second targeting sequence and exon 2 of the TPO
gene, and the remaining exons, introns, and 3' untranslated
regions of the TPO gene (Figure 7). Splicing of this
message results in the fusion of the second non-coding
exogenous exon to exon 2 of the endogenous TPO gene which,
when translated, will produce TPO. In this strategy the
first and second targeting sequences are upstream of the
normal target gene, but this is not required (see below).
The size of the intron in the targeting construct and thus
the position of the regulatory region relative to the
coding region of the gene may be varied to optimize the
function of the regulatory region.

Plasmid pRTPO2 is constructed as follows: Based on
the restriction map of the TPO upstream region (Figure 3),
a 1.8 kb BamHI-EcoRI fragment can be isolated from subclone
pBS(X)/5'Thromb.8 (Example 1). This fragment is ligated to
BamHI and EcoRI digested plasmid pBS (Stratagene, Inc., La
Jolla, CA) and transformed into competent E. coli cells to
generate pBS-TPO4. This fragment includes TPO exon 1 but
contains no TPO coding sequences.

Next, oligonucleotides 2.3 to 2.6 are used in PCR to
fuse CMV IE promoter sequences beginning at nucleotide 546
and ending at nucleotide 2105 of Genbank sequence HS5MIEP
to sequences from the TPO gene comprised of exon 1 and a
portion of intron 1. The properties of these primers are
as follows: 2.3 (SEQ ID NO: 7) is a 30 base
oligonucleotide homologous to a segment of the CMV IE
promoter beginning at nucleotide 546 of Genbank sequence
HS5MIEP (-614 relative to the cap site) and includes an XhoI
site at its 5' end; 2.4 (SEQ ID NO: 8) and 2.5 (SEQ ID NO:
9) are 60 nucleotide complementary primers which define the
fusion of CMV (position 2100 of Genbank sequence HS5MIEP)
and TPO (position -1881 relative to the TPO translation
start site) sequences; 2.6 (SEQ ID NO: 10) is 27
nucleotides in length and is homologous to TPO sequences
ending in TPO intron 1 at position -1374 relative to the
TPO translation start site and includes a natural ApaI
site.

Oligo 2.3 (SEQ ID NO: 7) 

5' TTTT CTCGAG GACATTGATT ATTGACTAGT 
        XhoI

Oligo 2.4 (SEQ ID NO: 8) 

5' catgggtctt ttctgcagtc accgtccttg CTACCCATCT GCTCCCCAGA 
GGGCTGCCTG 

Oligo 2.5 (SEQ ID NO: 9) 

5' CAGGCAGCCC TCTGGGGAGC AGATGGGTAG caaggacggt gactgcagaa 
aagacccatg 

Oligo 2.6 (SEQ ID NO: 10) 

5' TTTTGGGCCC TCCTCCCATT ACCCTCT 
       ApaI

Oligos 2.3-2.6: Bases in lower-case type denote CMV 
sequences; bases in upper-case type denote TPO sequences 
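
Because oligos 2.4 and 2.5 are described as complementary, the two
listed sequences can be checked against each other. The sketch below
takes the sequences exactly as printed above and uses the letter case
to separate the CMV-derived and TPO-derived halves; the helper name
is hypothetical.

```python
# Oligo sequences as printed in the listing (case carries the origin).
OLIGO_2_4 = ("catgggtctt" "ttctgcagtc" "accgtccttg"
             "CTACCCATCT" "GCTCCCCAGA" "GGGCTGCCTG")
OLIGO_2_5 = ("CAGGCAGCCC" "TCTGGGGAGC" "AGATGGGTAG"
             "caaggacggt" "gactgcagaa" "aagacccatg")

# Complement table that preserves upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string, keeping each base's case."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_by_case(seq: str) -> tuple:
    """Return (lower-case bases, upper-case bases)."""
    lower = sum(1 for base in seq if base.islower())
    return lower, len(seq) - lower


if __name__ == "__main__":
    print(len(OLIGO_2_4), len(OLIGO_2_5))
    print(reverse_complement(OLIGO_2_4) == OLIGO_2_5)
    print(count_by_case(OLIGO_2_4))
```

Each oligo is 60 nucleotides long, split evenly into 30 CMV and 30
TPO bases, and oligo 2.5 is the exact reverse complement of oligo
2.4.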



These primers are used to amplify a 2.1 kb DNA
fragment comprising a fusion of CMV IE and TPO sequences.
The fusion fragment is created by first using oligos 2.3
and 2.4 to amplify a 1.6 kb fragment from hGH expression
construct pXGH308, which has the CMV immediate-early (IE)
gene promoter region beginning at nucleotide 546 and ending
at nucleotide 2105 of Genbank sequence HS5MIEP fused to the
hGH sequences beginning at nucleotide 5225 and ending at
nucleotide 7322 of Genbank sequence HUMGHCSA. (The source
of the CMV IE gene is not critical, and other CMV IE
promoter-based plasmids may be used, or wild-type CMV DNA
may be used.) Then, oligos 2.5 and 2.6 are used to amplify
a 0.54 kb fragment containing portions of TPO exon 1 and
TPO intron 1 from plasmid pBS(X)/5'Thromb.8 (Example 1).
The two amplified fragments are then combined and further
amplified using oligos 2.3 and 2.6. The resulting product,
a 2.1 kb PCR fragment, is digested with XhoI and ApaI and
gel purified. Plasmid pMC1neo-C (see above) is digested
with SalI and XhoI and the 1.1 kb neo containing fragment
is gel purified. The purified 2.1 kb PCR fragment and the
1.1 kb neo fragment are then mixed and ligated to pBS-TPO4
(above) which has been cut with SalI and ApaI. The
ligation mixture is transformed into E. coli cells and a
plasmid with a single insert of each of the fusion fragment
and the neo gene is identified, this plasmid having the
SalI site at the 3' end of the neo gene regenerated by
ligation to the SalI site in the polylinker of pBS-TPO4.
The resulting plasmid is designated pBS-TPO5.
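
The fragment sizes quoted for this two-step fusion PCR can be
reproduced from the coordinates given for the primers. The sketch
below is a consistency check under stated assumptions: the quoted
coordinates are treated as inclusive endpoints (CMV 546 to 2100, TPO
-1881 to -1374), the non-hybridizing 5' tails are taken from the
printed oligo sequences (10 nt for oligo 2.3, 4 nt for oligo 2.6),
and the 60 nt fusion primers contribute 30 nt from each partner. The
names are hypothetical.

```python
def span_len(first: int, last: int) -> int:
    """Length of an inclusive coordinate span that does not cross zero."""
    return abs(last - first) + 1


CMV_TEMPLATE_LEN = span_len(546, 2100)       # CMV portion of the fusion
TPO_TEMPLATE_LEN = span_len(-1881, -1374)    # TPO portion of the fusion

TAIL_OLIGO_2_3 = 10   # TTTT + CTCGAG (XhoI), assumed not to hybridize
TAIL_OLIGO_2_6 = 4    # TTTT; the ApaI site is described as natural
FUSION_PRIMER_LEN = 60
HALF_FUSION = FUSION_PRIMER_LEN // 2         # 30 nt from each partner


def fusion_pcr_sizes() -> dict:
    """Predicted sizes (bp) of the two first-round products and the fusion."""
    cmv_product = TAIL_OLIGO_2_3 + CMV_TEMPLATE_LEN + HALF_FUSION
    tpo_product = HALF_FUSION + TPO_TEMPLATE_LEN + TAIL_OLIGO_2_6
    # The two products share the 60 bp defined by the fusion primers.
    fusion = cmv_product + tpo_product - FUSION_PRIMER_LEN
    return {"cmv_product": cmv_product,
            "tpo_product": tpo_product,
            "fusion": fusion}


if __name__ == "__main__":
    for name, size in fusion_pcr_sizes().items():
        print(name, size, "bp")
```

With these assumptions the predicted products are 1595 bp, 542 bp,
and 2077 bp, in line with the 1.6 kb, 0.54 kb, and 2.1 kb sizes
given in the text.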

A dhfr expression unit (to select for amplification in
targeted human cells) is then inserted at the ClaI site
located at the 5' end of the neo gene of pBS-TPO5. The
dhfr expression unit is isolated from plasmid pF8CIS9080
[Eaton et al., Biochemistry 25:8343-8347 (1986)] by
digestion with EcoRI and SalI. A 2 kb fragment containing
the dhfr expression unit is purified from this digest and
made blunt by treatment with the Klenow fragment of DNA
polymerase I. A ClaI linker (New England Biolabs, Beverly,
MA) is then ligated to the blunted dhfr fragment. The
products of this ligation are digested with ClaI and ligated
to ClaI digested pBS-TPO5. An aliquot of this ligation is
transformed into E. coli and plated on ampicillin selection
plates. Bacterial colonies are analyzed by restriction
enzyme digestion to determine the orientation of the
inserted dhfr fragment. One plasmid with dhfr in a
transcriptional orientation opposite that of the neo gene
is designated pBS-TPO6.

To complete plasmid pRTPO2, plasmid pBS(X)/5'Thromb.8
(Example 1) is partially digested with BamHI and ligated to
a SalI linker. The resulting DNA is then digested with
SalI and HindIII and the 3.7 kb fragment consisting of
sequences upstream of the TPO gene is isolated for use as a
second targeting sequence. This fragment is ligated to
HindIII-SalI digested pBS-TPO6 to generate the targeting
plasmid pRTPO2. For targeting to the TPO locus in cultured
human cells, pRTPO2 is digested with HindIII and EcoRI to
separate the targeting fragment containing the targeting
DNA, neo gene, dhfr gene, and CMV promoter from the pBS
plasmid backbone.

A third strategy for activation of the TPO gene is
shown in Figure 8. In this strategy, a targeting fragment
is introduced into the genome of recipient cells for
replacement of the normal TPO regulatory region, TPO exon
1, TPO intron 1, and TPO exon 2 with an exogenous
regulatory region, a coding exon, and a functional,
unpaired splice-donor site. Specifically, the targeting
construct from which this fragment is derived (pRTPO3) is
designed to include a first targeting sequence homologous
to sequences upstream of the TPO gene, an amplifiable
marker gene, a selectable marker gene, a regulatory region,
a CAP site, an exon which includes sequences coding for the
first 3 1/3 amino acids of the human growth hormone (hGH)
signal peptide, an unpaired splice-donor site, and a second
targeting sequence corresponding to TPO intron 2 sequences.
By this strategy, homologously recombinant cells produce an
mRNA precursor which corresponds to the exogenous coding
exon, intron 2 of the TPO gene, exon 3 of the TPO gene, and
the remaining exons, introns, and 3' untranslated regions
of the TPO gene (Figure 8). Splicing of this message
results in the fusion of the exogenous coding exon to exon
3 of the endogenous TPO gene which, when translated, will
produce a fusion protein in which the first 3 amino acids
of the signal peptide are derived from hGH. The signal
peptide of this molecule is cleaved off prior to secretion
from a cell to produce mature TPO. In this strategy the
first targeting sequence is upstream of the normal target
gene, while the second targeting sequence is within the
gene, between exons 2 and 3. The position of the first
targeting sequence and the amount of upstream DNA replaced
or deleted by the targeting event may be varied to optimize
the function of the regulatory region.

Plasmid pRTPO3 is constructed as follows:
Oligonucleotides 2.8 to 2.11 are used in PCR to fuse CMV IE
promoter sequences beginning at nucleotide 546 and ending
at nucleotide 1258 of Genbank sequence HS5MIEP to sequences
from the human growth hormone gene which encode the first 3
1/3 amino acids of the hGH signal peptide, a splice donor
site, and the second intron of the TPO gene. The
properties of these primers are as follows: Oligo 2.8 (SEQ
ID NO: 11) is a 30 base oligonucleotide homologous to a
segment of the CMV IE promoter beginning at nucleotide 546
of Genbank sequence HS5MIEP (-614 relative to the cap site)
and includes an XhoI site at its 5' end; 2.9 (SEQ ID NO:
12) and 2.10 (SEQ ID NO: 13) are 69 nucleotide
complementary primers which define the fusion of CMV
(position 2100 of Genbank sequence HS5MIEP) and hGH
(position -10 relative to the translation start
site of the hGH gene; see the hGH-N gene sequence in
Genbank entry HUMGHCSA) sequences. These primers also
include the first 29 base pairs of TPO intron 2
(nucleotides +14 to +42 relative to the TPO translation
start site), which include the splice donor site; 2.11 (SEQ
ID NO: 14) is 45 nucleotides in length and is homologous to
TPO sequences in TPO intron 2 starting at position +182
relative to the TPO translation start site and extending
upstream, and includes a natural EcoRI site at its 5' end.

The fusion fragment is created by first using oligos
2.8 and 2.9 to amplify a 0.7 kb fragment from CMV viral DNA
containing a wild-type immediate early gene and promoter
sequence. (The source of the CMV IE gene is not critical,
and other CMV IE promoter-based plasmids may be used.)
Then, oligos 2.10 and 2.11 are used to amplify a 0.17 kb
fragment containing a portion of TPO intron 2 from plasmid
pBS(X)/5'Thromb.8 (Example 1). The two amplified fragments
are then combined and further amplified using oligos 2.8
and 2.11. The resulting product, a 0.9 kb PCR fragment, is
digested with XhoI and EcoRI and gel purified. Next,
plasmid pBS(X)/5'Thromb.8 (Example 1) is partially
digested with BamHI and ligated to an XhoI linker. The
resulting DNA is then digested with XhoI and HindIII and
the 3.9 kb fragment consisting of sequences upstream of the
TPO gene is isolated for use as a second targeting
sequence. This fragment contains sequences from -5985 to
-2095 relative to the TPO translation start site (Figure
3). The isolated fragment is then ligated in a mixture
containing the 0.9 kb fusion fragment purified above and
HindIII and EcoRI digested plasmid pBS (Stratagene, Inc.,
La Jolla, CA) and transformed into competent E. coli cells
to generate pBS-TPO7.

For insertion of the neo selectable marker gene,
plasmid pMC1neo-C (see above) is digested with XhoI and
SalI and ligated to XhoI digested pBS-TPO7. The ligation
mix is transformed into E. coli cells and colonies are
analyzed by restriction enzyme analysis to identify a
plasmid with a single insert of the neo gene oriented such
that the direction of transcription is opposite to that of
the CMV promoter. This plasmid is designated pBS-TPO8.

A dhfr expression unit (to select for amplification in
targeted human cells) is then inserted at the ClaI site
located at the 5' end of the neo gene of pBS-TPO8. The
dhfr expression unit is isolated from plasmid pF8CIS9080
[Eaton et al., Biochemistry 25:8343-8347 (1986)] by
digestion with EcoRI and SalI. A 2 kb fragment containing
the dhfr expression unit is purified from this digest and
made blunt by treatment with the Klenow fragment of DNA
polymerase I. A ClaI linker (New England Biolabs, Beverly,
MA) is then ligated to the blunted dhfr fragment. The
products of this ligation are digested with ClaI and ligated
to ClaI digested pBS-TPO8. An aliquot of this ligation is
transformed into E. coli and plated on ampicillin selection
plates. Bacterial colonies are analyzed by restriction
enzyme digestion to determine the orientation of the
inserted dhfr fragment. One plasmid with dhfr in a
transcriptional orientation opposite that of the neo gene
is designated pRTPO3. For targeting to the TPO locus in
cultured human cells, pRTPO3 is digested with EcoRI and
HindIII to separate the targeting fragment containing the
targeting DNA, neo gene, dhfr gene, CMV promoter, and hGH
coding DNA from the pBS plasmid backbone.

Oligo 2.8 (SEQ ID NO: 11)

5' TTTTCTCGAG GACATTGATT ATTGACTAGT
       XhoI

Oligo 2.9 (SEQ ID NO: 12)

5' cgcggattcc ccgtgccaag CCTAGCGGCA ATGGCTACAG GTGAGAACAC
ACCTGAGGGG CTAGGGCCA

Oligo 2.10 (SEQ ID NO: 13)

5' TGGCCCTAGC CCCTCAGGTG TGTTCTCACC TGTAGCCATT GCCGCTAGGc
ttggcacggg gaatccgcg

Oligo 2.11 (SEQ ID NO: 14)

5' TTTTGAATTC CCATTCAGGA CCCAGACCTG AAACCCAGGG AATCC
       EcoRI

Oligos 2.8-2.11: Bases in lower-case type denote CMV
sequences; upper-case, non-bold bases denote TPO sequences;
boldface bases denote hGH exon 1 sequences.
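
The layout of oligo 2.9 can be dissected to see how the "3 1/3 amino
acids" of the hGH signal peptide arise. The sketch below uses the
sequence as printed. Because boldface is not available in plain text,
the boundaries of the hGH-derived bases are inferred here from the
description (CMV bases in lower case, the last 29 bases from TPO
intron 2); that inference and the function name are assumptions made
for this sketch.

```python
OLIGO_2_9 = ("cgcggattcc" "ccgtgccaag" "CCTAGCGGCA" "ATGGCTACAG"
             "GTGAGAACAC" "ACCTGAGGGG" "CTAGGGCCA")

TPO_INTRON2_LEN_IN_PRIMER = 29   # "first 29 base pairs of TPO intron 2"

# Only the codons needed for this example (standard genetic code).
CODON_TABLE = {"ATG": "Met", "GCT": "Ala", "ACA": "Thr"}


def dissect_oligo_2_9(oligo: str) -> dict:
    """Split the primer into CMV, hGH 5' UTR, hGH coding, and TPO intron parts."""
    n_cmv = len(oligo) - len(oligo.lstrip("acgt"))   # leading lower-case run
    intron_start = len(oligo) - TPO_INTRON2_LEN_IN_PRIMER
    atg = oligo.find("ATG", n_cmv)                   # first ATG after the CMV part
    coding = oligo[atg:intron_start]
    full_codons = [coding[i:i + 3] for i in range(0, len(coding) - len(coding) % 3, 3)]
    return {
        "cmv": oligo[:n_cmv],
        "hgh_5utr": oligo[n_cmv:atg],
        "codons": full_codons,
        "partial_codon": coding[len(full_codons) * 3:],
        "tpo_intron2": oligo[intron_start:],
    }


if __name__ == "__main__":
    parts = dissect_oligo_2_9(OLIGO_2_9)
    print(len(OLIGO_2_9), parts)
    print([CODON_TABLE[c] for c in parts["codons"]])
```

Read this way, the primer carries 20 CMV bases, 10 bases of hGH 5'
untranslated sequence (matching the stated position -10), three
complete codons (Met, Ala, Thr) plus one base of a fourth, and 29
bases of TPO intron 2 beginning with the GT of the splice-donor site.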

Other approaches for targeting and activation of the
TPO gene may be employed. For example, the first and
second targeting sequences may correspond to sequences in
the first or second intron of the TPO gene, and the
targeting sequences may include TPO coding sequences. In
any activation strategy, the second targeting sequence does
not need to lie immediately adjacent to or near the first
targeting sequence in the normal gene, such that portions
of the gene's normal upstream region are deleted upon
homologous recombination. Furthermore, one targeting
sequence may be upstream of the gene and one may be within
an exon or intron of the TPO gene.

A selectable marker gene is optional and the
amplifiable marker gene is only required when amplification
is desired. The amplifiable marker gene and selectable
marker gene may be the same gene, their positions may be
reversed, and one or both may be situated in the intron of
the targeting construct. Amplifiable marker genes and
selectable marker genes suitable for selection are
described herein. The incorporation of a specific CAP site
is optional. The regulatory region, CAP site, first
non-coding exon, splice-donor site, intron, second
non-coding exon, and splice acceptor site may be isolated
as a complete unit from the human elongation factor-1α
(EF-1α; Genbank sequence HUMEF1A) gene or the
cytomegalovirus (CMV; Genbank sequence HEHCMVP1) immediate
early region, or the components can be assembled from
appropriate components isolated from different genes. In
any case, either exogenous exon may be the same or
different from the first exon of the normal TPO gene, and
multiple non-coding exons may be present in the targeting
construct.

As described herein, a number of selectable and
amplifiable markers may be used in the targeting
constructs, and the activation may be effected in a large
number of cell types.

EXAMPLE 3: In Vitro Production of TPO by Activation and
Amplification of the TPO Gene in an
Immortalized Cell Line
Transfection of primary, secondary, or immortalized
human cells and isolation of homologously recombinant cells
expressing TPO may be accomplished using the methods
described in U.S. Serial No. 08/243,391, incorporated by
reference. Homologously recombinant cells may be
identified by a PCR screening strategy as exemplified therein
and in published methods available to one skilled in the
art (see, for example, Kim, H-S and Smithies, O., Nucl.
Acids Res. 16:8887-8903 (1988)). The identification of
cells expressing TPO may also be accomplished using a
variety of assays based on the structure or properties of
TPO. For example, TPO may be functionally identified by an
in vitro or in vivo megakaryocytopoiesis assay (de Sauvage
et al., Nature 369:533-538 (1994)). Alternatively, TPO may
be assayed by the stimulation of proliferation of cells
expressing c-mpl, the receptor for TPO. In this
assay, cells such as Ba/F3-mpl cells (de Sauvage et al.,
Nature 369:533-538 (1994)) are exposed to TPO and cell
proliferation is monitored by 3H-thymidine uptake. TPO may
also be assayed through its effects on in vivo platelet
production, either by direct platelet counts or by
incorporation of 35S into platelets. Finally, peptides
corresponding to portions of the TPO molecule may be
synthesized in order to generate anti-TPO antibodies for
use in an ELISA assay.

The isolation of cells containing amplified copies of
the amplifiable marker gene and the activated TPO locus is
performed as described in U.S. Serial No. 07/985,586,
incorporated by reference.

EXAMPLE 4: Cloning of the Human DNase I Gene and
Identification of the 5' Flanking Sequences

The human DNase I gene was isolated from a human
genomic DNA library. The library (Clontech, Palo Alto, CA;
Cat. #HL1006d) was constructed by cloning MboI partially
digested male leukocyte DNA into the BamHI site of the
bacteriophage lambda vector EMBL3. For library screening,
a DNA probe was isolated by PCR amplification of human
genomic DNA using oligonucleotides 4.1 and 4.2.

Oligo 4.1 (SEQ ID NO: 15) 

5' TGCCTTGAAG TGCTTCTTCA

Oligo 4.2 (SEQ ID NO: 16) 

5' CCTCAGAGAT GACGAGAATG C 
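
As a rough illustration of how the two primers compare, the sketch below computes each oligo's length, GC fraction, and an approximate melting temperature. The Wallace rule of thumb (2 degrees C per A/T, 4 degrees C per G/C) is an assumption introduced here and is not stated in the text; the helper name primer_stats is hypothetical.

```python
def primer_stats(seq: str) -> dict:
    """Return length, GC fraction, and Wallace-rule Tm for a DNA oligo."""
    s = seq.replace(" ", "").upper()
    gc = s.count("G") + s.count("C")
    at = s.count("A") + s.count("T")
    return {
        "length": len(s),
        "gc_fraction": gc / len(s),
        # Wallace rule of thumb (assumption): Tm ~ 2*(A+T) + 4*(G+C), degrees C
        "wallace_tm": 2 * at + 4 * gc,
    }

OLIGO_4_1 = "TGCCTTGAAG TGCTTCTTCA"    # SEQ ID NO: 15
OLIGO_4_2 = "CCTCAGAGAT GACGAGAATG C"  # SEQ ID NO: 16

for name, seq in (("4.1", OLIGO_4_1), ("4.2", OLIGO_4_2)):
    print(name, primer_stats(seq))
```

The rule-of-thumb temperature is only a coarse estimate for short oligos; it does not account for salt or primer concentration.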

These primers were designed based on the published
DNase I mRNA sequence (Shak, S. et al., Proc. Natl. Acad.
Sci. USA 87:9188-9192 (1990)). The amplified probe (probe
A; 126 bp) was labeled with 32P-dCTP by PCR and used to
screen a bacteriophage lambda genomic DNA library. The
filters were hybridized for 16 hours at 68°C in 125 mM
Na2HPO4 (pH 7.2), 250 mM NaCl, 10% PEG 8000, 7% SDS, 1 mM
EDTA. Filters were washed two times in 500 ml of 20 mM
Na2HPO4 (pH 7.2), 5% SDS, 1 mM EDTA, followed by 4 washes
in 500 ml of 20 mM Na2HPO4 (pH 7.2), 1% SDS, 1 mM EDTA.
The wash buffers were preheated to 56°C and washing was
performed at room temperature on a rotary shaker for
approximately 5 minutes per wash. The hybridization
signals were visualized by autoradiography at -80°C with an
intensifying screen. In this experiment, approximately 1 x
10^6 phage were screened and 18 positive signals were
obtained. Bacteriophage plaques corresponding to 10 of the
positive signals were plated at low density and subjected
to a second round of screening using probe A. Four of the
phage (designated 2a, 3b, 4c and 14a) gave positive
hybridization signals following the secondary screening and
were retained for further analysis. DNA was isolated from
the plaque-purified phage following amplification and
subsequent purification by cesium chloride gradient
ultracentrifugation (Yamamoto, K.R. et al., Virology 40:734
(1970)). Library screening, plaque purification of
recombinant bacteriophage and isolation of bacteriophage
DNA were performed using standard methods (Ausubel et al.,
Current Protocols in Molecular Biology, Wiley, New York, NY
(1987)).
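
The hybridization and wash buffers above are specified only by final concentration. A minimal sketch of how the stock volumes could be worked out with C1*V1 = C2*V2 follows. The stock concentrations, the 100 ml hybridization volume, and the function name stock_volumes are assumptions for illustration; the text does not say how the buffers were prepared.

```python
def stock_volumes(total_ml, recipe, stocks):
    """Return ml of each stock (C1*V1 = C2*V2) plus water to reach total_ml."""
    vols = {name: total_ml * final / stocks[name] for name, final in recipe.items()}
    vols["water"] = total_ml - sum(vols.values())
    return vols

# Hypothetical stock concentrations (assumptions, not from the text).
# Salts are in mM; PEG 8000 and SDS are in percent.
STOCKS = {"Na2HPO4": 1000, "NaCl": 5000, "PEG 8000": 50, "SDS": 20, "EDTA": 500}

# Final concentrations as given in the text.
HYBRIDIZATION = {"Na2HPO4": 125, "NaCl": 250, "PEG 8000": 10, "SDS": 7, "EDTA": 1}
FIRST_WASH = {"Na2HPO4": 20, "SDS": 5, "EDTA": 1}
SECOND_WASH = {"Na2HPO4": 20, "SDS": 1, "EDTA": 1}

for label, recipe, ml in (("hybridization", HYBRIDIZATION, 100),
                          ("first wash", FIRST_WASH, 500),
                          ("second wash", SECOND_WASH, 500)):
    print(label, stock_volumes(ml, recipe, STOCKS))
```

With two first washes and four second washes of 500 ml each, the protocol as written uses 3 liters of wash buffer per set of filters.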

Based on restriction enzyme digestion and Southern
blot analysis using probe A, two of the phage (4c and 14a)
contain a common HincII fragment of approximately 8 kb
which encompasses exon 1, intron 1, exon 2, coding and
non-coding sequences corresponding to intron 2 and
downstream DNase I exons, as well as approximately 4 kb of
non-transcribed DNA lying upstream of DNase I exon 1. This
fragment was isolated from one genomic clone (4c) and
subcloned into pBSIISK+ (Stratagene Inc., La Jolla, CA) for
further analysis. Restriction enzyme mapping of the
resultant clone, pBS/4C.2Hinc2, was used to generate the
restriction map shown in Figure 9. The nucleotide sequence
of the non-transcribed DNase I 5' region lying upstream of
the 5' end of the known cDNA sequence is shown in Figure 10
(SEQ ID NO: 17). The nucleotide sequence lying downstream
of the 5' end of the known cDNA sequence, including exon 1,
intron 1 and part of exon 2, is shown in Figure 11 (SEQ ID
NO: 18). Comparison of the cloned genomic sequence
presented here with the published cDNA sequence (Shak, S.
et al., Proc. Natl. Acad. Sci. USA 87:9188-9192 (1990))
reveals that the 5' end of the DNase I gene consists of a
non-coding exon (exon 1) of 142 bp and a second exon (exon
2) which is at least 341 bp. Exon 2 encodes a 22 amino
acid signal sequence and a portion of the mature DNase I
peptide, beginning with an AUG translational initiation
codon which lies 1 bp downstream of the 5' end of exon 2.
Exons 1 and 2 are separated by intron 1 which is 336 bp in
length.
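
The reported segment lengths can be laid out as coordinates to make the 5' structure easier to picture. The sketch below is illustrative only: it assumes position 1 is the first nucleotide of exon 1, a numbering convention not used in the text, and it treats the "at least 341 bp" of exon 2 as exactly 341 bp. The function name layout is hypothetical.

```python
# Segment lengths in bp as reported above; exon 2 is "at least" 341 bp.
SEGMENTS = [
    ("exon 1 (non-coding)", 142),
    ("intron 1", 336),
    ("exon 2 (partial)", 341),
]

def layout(segments, start=1):
    """Assign 1-based inclusive coordinates to consecutive segments."""
    out, pos = [], start
    for name, length in segments:
        out.append((name, pos, pos + length - 1))
        pos += length
    return out

for name, first, last in layout(SEGMENTS):
    print(f"{name}: {first}-{last}")
```

Under these assumptions the three segments together cover 819 bp of genomic sequence.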

EXAMPLE 5: Construction of Targeting Plasmids for
Activation and Amplification of the DNase I
Gene

The activation of the DNase I gene can be accomplished
by the strategy outlined in Figure 12. In this strategy, a
targeting fragment is introduced into the genome of
recipient cells for insertion of a regulatory region, a
non-coding exon and a functional unpaired splice-donor site
upstream of the DNase I coding region. Specifically, the
targeting construct from which this fragment is derived
(pDNaseI) is designed to include a 5' targeting sequence
homologous to sequences upstream of the DNase I gene, a
selectable marker gene, an amplifiable marker gene, a
regulatory region, a CAP site, a non-coding exon, an
unpaired splice-donor site, and a 3' targeting sequence
corresponding to sequences downstream of the 5' targeting
sequence but upstream of DNase I exon 1. According to this
strategy, integration of the targeting construct by
homologous recombination generates recombinant cells
producing an mRNA precursor which includes the non-coding
exon introduced upstream of the DNase I gene, the 3'
targeting sequence, any sequences between the 3' targeting
sequence and exon 2 of the DNase I gene, and the remaining
exons, introns and 3' untranslated regions of the DNase I
gene (Figure 12). Splicing of this transcript results in
the fusion of the exogenous non-coding exon to exon 2 of
the endogenous DNase I gene. DNase I is produced by
translation of the mature mRNA. According to this
strategy, both the 5' and 3' targeting sequences are
upstream of the endogenous target gene. The size of the
chimeric intron in the targeting construct, which is
dictated by the position of the regulatory region relative
to the coding sequence, may be varied to optimize the
function of the regulatory region.
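
The splicing outcome described above can be pictured with a toy model. The sketch below is a deliberate simplification and not a description of the splicing machinery: it assumes each donor is joined to the next downstream acceptor, lumps the downstream exons into one segment, and uses hypothetical names (Segment, splice). It shows why endogenous exon 1, which has no acceptor, is removed together with the chimeric intron.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    name: str
    acceptor: bool  # functional splice-acceptor at the 5' end
    donor: bool     # functional splice-donor at the 3' end

def splice(pre_mrna):
    """Toy splicing: keep the first segment, then jump from each donor
    to the next downstream segment that carries an acceptor."""
    i, current = 0, pre_mrna[0]
    mature = [current.name]
    while current.donor:
        nxt = next((j for j in range(i + 1, len(pre_mrna))
                    if pre_mrna[j].acceptor), None)
        if nxt is None:
            break
        i, current = nxt, pre_mrna[nxt]
        mature.append(current.name)
    return mature

# Order of segments in the mRNA precursor, following the text.
PRE_MRNA = [
    Segment("exogenous non-coding exon", acceptor=False, donor=True),
    Segment("3' targeting sequence", acceptor=False, donor=False),
    Segment("DNase I exon 1", acceptor=False, donor=True),
    Segment("DNase I intron 1", acceptor=False, donor=False),
    Segment("DNase I exon 2", acceptor=True, donor=True),
    Segment("downstream introns", acceptor=False, donor=False),
    Segment("downstream exons", acceptor=True, donor=False),
]

print(splice(PRE_MRNA))
```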

Plasmid pCND1, which contains the activation cassette,
is constructed as follows: A 1555 bp (size includes a 9 bp
synthetic HindIII recognition site at the 5' end of oligo
5.2) fragment is amplified using oligos 5.1 and 5.2. The
amplified fragment encompasses the CMV IE promoter, CMV IE
exon 1 (non-coding exon) and 827 bp of CMV IE intron 1,
beginning at nucleotide 172,783 and ending at nucleotide
174,328 of EMBL sequence X17403 (Human cytomegalovirus
strain AD169). (The source of the CMV IE gene is not
critical, and CMV IE promoter-based plasmids or wild-type
CMV DNA may be used.) Oligo 5.1 (21 bp, SEQ ID NO: 19)
hybridizes to the CMV IE promoter at -598 relative to the
CAP site (EMBL sequence X17403). Oligo 5.2 (32 bp, SEQ ID
NO: 20) contains 23 nucleotides which hybridize to the CMV
IE promoter at +946 relative to the CAP site; the
additional 9 bp at the 5' end of the oligo create a
synthetic HindIII recognition sequence. The 1555 bp PCR
product is digested with HindIII and the resultant 1551 bp
fragment is purified and used in the ligation described
below. Next, the neomycin phosphotransferase (neo) gene is
isolated from plasmid pBSneo for use as a selectable marker
for the isolation of stably transfected human cells. The
neo gene in plasmid pBSneo was obtained by BamHI and XhoI
digestion of pMC1neo-polyA (Thomas, K.R. and Capecchi, M.R.,
Cell 51:503-512 (1987)). Plasmid pMC1neo-polyA was
digested with BamHI and made blunt-ended with the Klenow
fragment of E. coli DNA polymerase I. The resulting DNA
was digested with XhoI, and the blunt-ended BamHI-XhoI
fragment was cloned into HincII and XhoI digested plasmid
pBSIISK+. For isolation of the neo gene harbored on
pBSneo, plasmid pBSneo is digested with XhoI and made
blunt-ended by treatment with the Klenow fragment of E.
coli DNA polymerase I. The resulting DNA is digested with
HindIII and an 1165 bp fragment containing the neo
expression unit is gel purified. The 1165 bp neo fragment
and the 1551 bp CMV promoter fragment are ligated, the
ligation products are digested with HindIII and the 2716 bp
HindIII fragment, resulting from blunt-end ligation of the
two fragments, is gel purified. The 2716 bp HindIII
product is ligated to HindIII digested plasmid pBSIISK+
(Stratagene Inc., La Jolla, CA) and electroporated into E.
coli. Colonies containing inserts in the HindIII site of
pBSIISK+ are analyzed by restriction enzyme analysis to
confirm the orientation of the insert. One recombinant
plasmid, in which the CMV promoter is oriented such that the
oligo 5.2 sequences (+946 relative to the CMV IE CAP site)
are proximal to the SalI recognition sequence in the
pBSIISK+ polylinker, is identified and designated pCN1.



Oligo 5.1 (SEQ ID NO: 19) 
5' GACATTGATT ATTGACTAGT T

Oligo 5.2 (SEQ ID NO: 20)

5' TTTAAGCTTC TGCAGAAAAG ACCCATGGAA AG 
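
The fragment sizes quoted for the pCN1 construction can be checked with simple arithmetic. The sketch below is an illustration under stated assumptions: lengths are counted on the strand that carries oligo 5.2, and HindIII is taken to cut after the first A of its recognition sequence (A^AGCTT), a well-known property of the enzyme that the text does not spell out. The helper name span is hypothetical.

```python
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence, cut as A^AGCTT
OLIGO_5_2 = "TTTAAGCTTC TGCAGAAAAG ACCCATGGAA AG".replace(" ", "")  # SEQ ID NO: 20
NEO_FRAGMENT_BP = 1165   # neo expression unit, as stated in the text

def span(first, last):
    """Inclusive nucleotide count between two sequence coordinates."""
    return abs(first - last) + 1

tail, annealing = OLIGO_5_2[:9], OLIGO_5_2[9:]   # synthetic tail, CMV-specific part
cmv_bp = span(172_783, 174_328)                  # CMV IE region of EMBL X17403
pcr_bp = cmv_bp + len(tail)                      # PCR product including the tail
removed = OLIGO_5_2.index(HINDIII_SITE) + 1      # bases 5' of the HindIII cut
digested_bp = pcr_bp - removed                   # fragment after HindIII digestion
ligated_bp = digested_bp + NEO_FRAGMENT_BP       # CMV fragment joined to neo

print(len(annealing), cmv_bp, pcr_bp, digested_bp, ligated_bp)
```

The computed values agree with the 23-nucleotide annealing region and the 1555 bp, 1551 bp and 2716 bp fragments given in the text.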

Next, the dhfr expression unit is inserted at a ClaI
site which is located at the 3' end of the neo gene of
pCN1. The dhfr expression unit is obtained by EcoRI and
SalI digestion of plasmid pF8CIS9080 (Eaton et al.,
Biochemistry 25:8343-8347 (1986)). The resultant 2 kb
fragment is purified from the digest and made blunt with
the Klenow fragment of E. coli DNA polymerase I. A ClaI
linker (5' CCATCGATGG; NEB 1088, New England Biolabs,
Beverly, MA) is ligated to the blunt-ended dhfr fragment and
the ligation products are digested with ClaI. pCN1 is
digested with ClaI, and the ClaI dhfr-containing fragment
is ligated into the ClaI site of pCN1. An aliquot of the
ligation reaction is electroporated into E. coli and
colonies harboring inserts in a ClaI site of pCN1 are
analyzed by restriction enzyme analysis to determine the
site of insertion and the orientation of the insert. A
plasmid with the dhfr expression unit at the 3' end of the
neo gene and with the same transcriptional orientation as
that of the neo gene is identified and designated pCND1.

Plasmid pDNaseI is constructed as follows: Based on
the restriction map of the upstream region of the DNase I
gene (Figure 9), a 664 bp BamHI fragment (-1161 to -498 in
Figure 8) can be isolated from subclone pBS/4C.2Hinc2.
This fragment is ligated to BamHI digested plasmid
pBSIISK+ΔApaI (modification of pBSIISK+; Stratagene Inc.,
La Jolla, CA) in which the ApaI recognition sequence in the
polylinker is destroyed. pBSIISK+ΔApaI is constructed by
digesting pBSIISK+ with ApaI, conversion of the
cohesive ends to blunt ends with T4 DNA polymerase and
ligation to generate the circular plasmid. Following
ligation of the 664 bp BamHI fragment into pBSIISK+ΔApaI,
the ligation products are electroporated into E. coli cells
to generate pBS-DNaseI. The sequences contained in this
fragment reside upstream of DNase I exon 1, position -1162
to -498 with respect to the AUG translational initiation
codon (nucleotide +1). The activation cassette, which
contains the CMV immediate-early (IE) promoter region, the
CMV IE CAP site, a non-coding exon, an unpaired splice-donor
site, the neomycin phosphotransferase (neo)
selectable marker gene and dhfr expression unit (to select
for amplification in targeted human cells), is cloned into
the unique ApaI site of the 664 bp BamHI fragment (DNase I
upstream region) in pBS-DNaseI (see Figure 12).
Specifically, plasmid pCND1, which contains the activation
cassette, is digested with SalI, which cuts downstream of
the dhfr expression unit, and EspI, which cuts 242 bp
downstream of the CMV IE CAP site. A 3,955 bp SalI-EspI
fragment containing the activation cassette is purified
from this digest and the cohesive ends are made blunt by
treatment with the Klenow fragment of E. coli DNA
polymerase I. This fragment is ligated to plasmid
pBS-DNaseI, which has been digested with ApaI and made
blunt-ended by treatment with T4 DNA polymerase, and
electroporated into E. coli. Colonies containing inserts
of the activation cassette at the blunt-ended ApaI
site of pBS-DNaseI are analyzed by restriction enzyme
analysis to confirm the orientation of the insert. One
recombinant plasmid in which the CMV promoter is oriented
such that the direction of transcription is towards DNase I
exon 1 is identified and designated pDNaseI.

Plasmid pDNaseI is digested with BamHI for
transfection into human cells. Transfection of primary,
secondary, or immortalized human cells and isolation of
homologously recombinant cells expressing DNase I may be
accomplished using the methods described in U.S. Serial No.
08/243,391, incorporated herein by reference.
Homologously recombinant cells may be identified by a PCR
screening strategy as exemplified therein and in published
methods available to one skilled in the art (see, for
example, Kim, H-S and Smithies, O., Nucl. Acids Res.
16:8887-8903 (1988)). The identification of cells
expressing DNase I may also be accomplished using a variety
of assays based on the structure or properties of DNase I.
For example, DNase I may be functionally identified by an
in vitro enzyme assay (cf. Kunitz, J. Gen. Physiol. 33:349
(1950); McDonald, Meth. Enzymol. 2:437 (1955)) or by the
use of anti-DNase I antibodies in an ELISA assay.

The isolation of cells containing amplified copies of 
the amplifiable marker gene and the activated DNase I locus 
is performed as described in U.S. Serial No. 07/985,586,
incorporated herein by reference.

EXAMPLE 6: Cloning of the Human β-Interferon Gene and
Identification of the 5' Flanking Sequences

The human β-interferon gene was isolated from a human
genomic DNA library. The library (Clontech, Palo Alto, CA;
Cat. #HL1006d) was constructed by cloning MboI partially
digested male leukocyte DNA into the BamHI site of the
bacteriophage lambda vector EMBL3. For library screening,
a DNA probe was isolated by PCR amplification of human
genomic DNA using oligonucleotides 6.1 and 6.2.

Oligo 6.1 (SEQ ID NO: 21) 
5' TGCTCTGGCA CAACAGGTAG

Oligo 6.2 (SEQ ID NO: 22) 
5' CATAGATGGT CAATGCGGC 

These primers were designed based on the published
β-interferon mRNA sequence (May, L.T. and Sehgal, P.B., J.
Interferon Res. 5:521-526 (1985)). The amplified probe
(probe A; 290 bp) was labeled with 32P-dCTP by PCR and used
to screen a bacteriophage lambda genomic DNA library. The
filters were hybridized for 16 hours at 68°C in 125 mM
Na2HPO4 (pH 7.2), 250 mM NaCl, 10% PEG 8000, 7% SDS, 1 mM
EDTA. Filters were washed two times in 500 ml of 20 mM
Na2HPO4 (pH 7.2), 5% SDS, 1 mM EDTA, followed by 4 washes
in 500 ml of 20 mM Na2HPO4 (pH 7.2), 1% SDS, 1 mM EDTA.
The wash buffers were preheated to 56°C and washing was
performed at room temperature on a rotary shaker for
approximately 5 minutes per wash. The hybridization
signals were visualized by autoradiography at -80°C with an
intensifying screen. In this experiment, approximately 1 x
10^6 phage were screened and 6 positive signals were
obtained. Bacteriophage plaques corresponding to the
positive signals were plated at low density and subjected
to a second round of screening using probe A. Five of the
phage (designated 1a, 2a, 2b, 11a, and 12a) gave positive
hybridization signals following the secondary screening and
were retained for further analysis. DNA was isolated from
the plaque-purified phage following amplification and
subsequent purification by cesium chloride gradient
ultracentrifugation (Yamamoto, K.R. et al., Virology 40:734
(1970)). Library screening, plaque purification of
recombinant bacteriophage and isolation of bacteriophage
DNA were performed using standard methods (Ausubel et al.,
Current Protocols in Molecular Biology, Wiley, New York, NY
(1987)).

Based on restriction enzyme digestion and Southern
blot analysis using probe A, all five of the phage (1a, 2a,
2b, 11a, and 12a) were shown to contain a common HindIII
fragment of approximately 10 kb which encompasses the
entire sequence coding for β-interferon (561 bp), 666 bp of
3' untranslated sequence and approximately 9 kb of
non-transcribed DNA lying upstream of the β-interferon
gene. This fragment was isolated from one genomic clone
(1a) and subcloned into pBSIISK+ (Stratagene Inc., La
Jolla, CA) for further analysis. The resultant clones,
pBS-H3/Bint.11-3 and pBS-H3/Bint.11-21, harbor the 10 kb
HindIII fragment in opposite orientations with respect to
the plasmid backbone. Restriction enzyme mapping was used
to generate the restriction map shown in Figure 13. The
nucleotide sequence of 8,355 bp of DNA lying upstream of
the previously reported sequence (Genbank entry HUMIFNB1F)
is shown in Figure 14 (SEQ ID NO: 23). The nucleotide
sequence corresponding to 356 bp of DNA upstream of the
β-interferon coding region, the β-interferon coding region,
and 666 bp of 3' untranslated sequence is shown in Figure
15 (SEQ ID NO: 24). Comparison of the cloned genomic
sequence presented here with the published cDNA sequence
(May, L.T. and Sehgal, P.B., J. Interferon Res. 5:521-526
(1985)) confirms that the β-interferon gene consists of a
561 bp coding region which is co-linear with its cognate
mRNA (lacks introns). The β-interferon gene encodes a 21
amino acid signal sequence and a 166 amino acid mature
peptide, beginning with an AUG translational initiation
codon which lies 82 bp downstream of the CAP site.
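
As a quick consistency check on the lengths quoted for the coding region, the sketch below converts the 561 bp coding region into codons and subtracts the signal sequence. It assumes the 561 bp figure excludes the stop codon, which the text does not state; the function name peptide_lengths is hypothetical.

```python
CODING_BP = 561   # coding-region length reported in the text
SIGNAL_AA = 21    # signal-sequence length reported in the text

def peptide_lengths(coding_bp, signal_aa):
    """Split a coding region into codons and derive the mature-peptide length."""
    codons, remainder = divmod(coding_bp, 3)
    if remainder:
        raise ValueError("coding region is not a whole number of codons")
    return {"precursor_aa": codons, "mature_aa": codons - signal_aa}

print(peptide_lengths(CODING_BP, SIGNAL_AA))
```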

EXAMPLE 7: Construction of Targeting Plasmids for
Activation and Amplification of the
β-Interferon Gene

The activation of the β-interferon gene can be
accomplished by the strategy outlined in Figure 16. In
this strategy, a targeting fragment is introduced into the
genome of recipient cells for replacement of the endogenous
β-interferon regulatory region with an exogenous regulatory
region, a non-coding exon, an intron, and chimeric exon
sequences consisting of sequences from a noncoding exon
(derived from exon 2 of the CMV IE gene) and sequences from
the β-interferon 5' noncoding region. Specifically, the
targeting construct from which this fragment is derived
(pIFNβ-1) is designed to include a 5' targeting sequence
homologous to sequences upstream of the β-interferon gene,
a selectable marker gene, an amplifiable marker gene, a
regulatory region, a CAP site, a non-coding exon, an
intron, chimeric exon sequences consisting of CMV IE exon 2
sequences and β-interferon 5' noncoding DNA, and a 3'
targeting sequence homologous to DNA upstream of the
β-interferon coding region. According to this strategy,
integration of the targeting construct by homologous
recombination generates recombinant cells producing an mRNA
precursor which includes the non-coding exon introduced
upstream of the β-interferon gene, an intron, the chimeric
exon which fuses CMV IE exon sequences to β-interferon 5'
noncoding sequences and the entire β-interferon coding
region, and 3' untranslated regions of the β-interferon
gene (Figure 16). The chimeric exon consists of 17 bp of
CMV IE exon 2 (position 172,782 to 172,766 of EMBL sequence
X17403) joined to the 5' flanking region of the
β-interferon gene (position -173 with respect to the AUG
translational initiation codon). Splicing of this
transcript results in the fusion of the exogenous
non-coding exon to exon 2, which includes the complete
coding sequence of the endogenous β-interferon gene.
β-interferon is produced by translation of the mature mRNA.
According to this strategy, the 5' targeting sequence is
upstream of the endogenous target gene and the 3' targeting
sequence is in the β-interferon 5' noncoding region. The
position of the regulatory region relative to the 5'
flanking sequence may be varied (e.g. by altering the size
of the intron in the targeting construct) to optimize the
function of the regulatory region.

Plasmid pIFNβ-1 is constructed as follows: A 182 bp
fragment (size includes a 9 bp synthetic BamHI recognition
site at the 5' end of oligo 7.2) is amplified from
pBS-H3/Bint.11-3 using oligos 7.1 and 7.2. The amplified
fragment serves as the 3' targeting sequence (Figure 16).
Oligo 7.1 (21 bp, SEQ ID NO: 25) hybridizes to the
β-interferon 5' non-transcribed region at position -173
with respect to the β-interferon AUG translational
initiation codon (Figure 15). Oligo 7.2 (30 bp, SEQ ID NO:
26) contains 21 nucleotides which hybridize to the
β-interferon 5' untranslated region at position -1 relative
to the AUG translational start codon (see Figure 16), with
the additional 9 bp at the 5' end of the oligo creating a
synthetic BamHI recognition sequence. The 182 bp PCR
product is purified and used in the ligation described
below. Next, a 1571 bp (size includes an 8 bp synthetic
SmaI recognition sequence at the 5' end of oligo 7.3)
fragment is amplified using oligos 7.3 and 7.4. The
amplified fragment encompasses the CMV IE promoter, CMV IE
exon 1 (non-coding exon), CMV IE intron 1 and 17 bp of CMV
IE exon 2, beginning at nucleotide 174,328 and ending at
nucleotide 172,766 of EMBL sequence X17403 (Human
cytomegalovirus strain AD169). (The source of the CMV IE
gene is not critical, and CMV IE promoter-based plasmids or
wild-type CMV DNA may be used.) Oligo 7.3 (29 bp, SEQ ID
NO: 27) contains 21 nucleotides which hybridize to the CMV
IE promoter at -598 relative to the CAP site (EMBL sequence
X17403); the 5' end of the oligo also contains an 8 bp
synthetic SmaI recognition sequence. Oligo 7.4 (21 bp, SEQ
ID NO: 28) hybridizes to the CMV IE promoter at +965
relative to the CAP site. The 1571 bp PCR product
containing the CMV IE promoter, CMV IE exon 1, CMV IE
intron 1 and 17 bp of CMV IE exon 2 is gel purified and
ligated to the 182 bp fragment containing the β-interferon
5' flanking region. The ligation products are digested
with BamHI and SmaI, and the 1742 bp SmaI-BamHI fragment,
resulting from ligation of β-interferon sequences (position
-173 with respect to the AUG translational initiation
codon) to CMV IE sequences (-598 relative to the CMV IE CAP
site), is gel purified. The 1742 bp SmaI-BamHI fragment is
ligated to BamHI and SmaI digested plasmid pBSIISK+
(Stratagene Inc., La Jolla, CA) and electroporated into E.
coli. Colonies containing inserts in pBSIISK+ are analyzed
by restriction enzyme analysis to confirm the structure of
the insert. One recombinant plasmid is identified and
designated pBS-CB.

Oligo 7.1 (SEQ ID NO: 25) 
5' TGACATAGGA AAACTGAAAG G 

Oligo 7.2 (SEQ ID NO: 26)

5' TTTGGATCCG TTGACAACAC GAACAGTGTC G 

Oligo 7.3 (SEQ ID NO: 27) 

5' TTTCCCGGGA CATTGATTAT TGACTAGTT 

Oligo 7.4 (SEQ ID NO: 28) 
5' CGTGTCAAGG ACGGTGACTG C
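
Two details of the CMV amplification can be verified directly from the listed sequences and coordinates: oligo 7.3 is the oligo 5.1 sequence with an 8-nucleotide SmaI tail, and the quoted coordinates reproduce the fragment sizes. The sketch below is illustrative; the helper name span is hypothetical, and coordinates are counted inclusively regardless of strand.

```python
OLIGO_5_1 = "GACATTGATT ATTGACTAGT T".replace(" ", "")          # SEQ ID NO: 19
OLIGO_7_3 = "TTTCCCGGGA CATTGATTAT TGACTAGTT".replace(" ", "")  # SEQ ID NO: 27
SMAI_SITE = "CCCGGG"  # SmaI recognition sequence

def span(first, last):
    """Inclusive nucleotide count between two coordinates, on either strand."""
    return abs(first - last) + 1

tail, annealing = OLIGO_7_3[:8], OLIGO_7_3[8:]
shares_5_1 = annealing == OLIGO_5_1   # same CMV-specific sequence as oligo 5.1
site_at = OLIGO_7_3.find(SMAI_SITE)   # 0-based start of the SmaI site
cmv_bp = span(174_328, 172_766)       # CMV IE region of EMBL X17403
pcr_bp = cmv_bp + len(tail)           # PCR product including the tail
exon2_bp = span(172_782, 172_766)     # CMV IE exon 2 portion

print(shares_5_1, site_at, cmv_bp, pcr_bp, exon2_bp)
```

The SmaI site starts within the tail and its last G is the first nucleotide of the CMV-specific sequence, which is why an 8-nucleotide tail is enough to create the 6 bp site.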

The neomycin phosphotransferase (neo) gene is isolated
from plasmid pBSneo for use as a selectable marker for the
isolation of stably transfected human cells. The neo gene
in plasmid pBSneo was obtained by BamHI and XhoI digestion
of pMC1neo-polyA (Thomas, K.R. and Capecchi, M.R., Cell
51:503-512 (1987)). Plasmid pMC1neo-polyA was digested
with BamHI and made blunt-ended with the Klenow fragment of
E. coli DNA polymerase I. The resulting DNA was digested
with XhoI, and the blunt-ended BamHI-XhoI fragment was
cloned into HincII and XhoI digested plasmid pBSIISK+. For
isolation of the neo gene harbored on pBSneo, plasmid
pBSneo is digested with XhoI and made blunt-ended by
treatment with the Klenow fragment of E. coli DNA
polymerase I. The resulting DNA is digested with HindIII
and an 1165 bp fragment containing the neo expression unit
is gel purified. The 1165 bp fragment is ligated to SmaI
and HindIII digested plasmid pBS-CB and electroporated into
E. coli. Colonies containing inserts in pBS-CB are
analyzed by restriction enzyme analysis to confirm the
orientation of the insert. One recombinant plasmid is
identified and designated pBS-CBN.

Next, the dhfr expression unit is inserted at the ClaI
site which is located at the 3' end of the neo gene of
pBS-CBN. The dhfr expression unit is obtained by EcoRI and
SalI digestion of plasmid pF8CIS9080 (Eaton et al.,
Biochemistry 25:8343-8347 (1986)). The resultant 2 kb
fragment is purified from the digest and made blunt with
the Klenow fragment of E. coli DNA polymerase I. A ClaI
linker (5' CCATCGATGG; NEB 1088, New England Biolabs,
Beverly, MA) is ligated to the blunt-ended dhfr fragment, and
the ligation products are digested with ClaI and purified. The
ClaI dhfr-containing fragment is ligated into ClaI digested
plasmid pBS-CBN. An aliquot of the ligation reaction is
electroporated into E. coli and colonies harboring inserts
in a ClaI site of pBS-CBN are analyzed by restriction
enzyme analysis to determine the site of insertion and the
orientation of the insert. A plasmid with the dhfr
expression unit at the 3' end of the neo gene and with the
same transcriptional orientation as that of the neo gene is
identified and designated pBS-CBND.

Finally, the targeting construct is assembled by
insertion of the 5' targeting sequence (Figure 16) in the
unique SalI site located at the 3' end of the dhfr
expression unit in plasmid pBS-CBND. To obtain the 5'
targeting sequence, the plasmid pBS-H3/Bint.11-3 is
digested with EcoRI and PvuII and the resultant 1.2 kb
fragment is purified, ligated to EcoRI-SmaI digested
plasmid pBSIISK+ (Stratagene Inc., La Jolla, CA) and
electroporated into E. coli. Colonies containing inserts
in pBSIISK+ are analyzed by restriction enzyme analysis,
and one plasmid containing the insert is retained and
designated pBS-BI5. Plasmid pBS-BI5 is digested with SpeI
and EcoRV and made blunt-ended with the Klenow fragment of
DNA polymerase I. The resulting 1.2 kb fragment is ligated
to SalI digested plasmid pBS-CBND, which has been made
blunt-ended with the Klenow fragment of E. coli DNA
polymerase I. An aliquot of the blunt-end ligation
reaction is electroporated into E. coli and colonies
harboring inserts in the SalI site of pBS-CBND are analyzed
by restriction enzyme analysis to determine the orientation
of the insert. A plasmid with the EcoRI site at the 3' end
of the dhfr expression unit is identified and designated
pIFNβ-1.

Plasmid pIFNβ-1 is digested with BamHI for
transfection into human cells. Transfection of primary,
secondary, or immortalized human cells and isolation of
homologously recombinant cells expressing β-interferon may
be accomplished using the methods described in U.S. Serial
No. 08/243,391, incorporated herein by reference.
Homologously recombinant cells may be identified by a PCR
screening strategy as exemplified therein and in published
methods available to one skilled in the art (see, for
example, Kim, H-S and Smithies, O., Nucl. Acids Res.
16:8887-8903 (1988)). The identification of cells
expressing β-interferon may also be accomplished using a
variety of assays based on the structure or properties of
β-interferon. For example, β-interferon may be identified
by an in vitro reverse passive hemagglutination assay
(Accurate Chemical Corp., Westbury, NY), stimulation of
superoxide anion production by mouse peritoneal macrophages
(Coligan, J.E. et al., Current Protocols in Immunology,
Wiley, New York, NY (1994)), or by using anti-β-interferon
antibodies in an ELISA assay.

The isolation of cells containing amplified copies of
the amplifiable marker gene and the activated β-interferon
locus is performed as described in U.S. Serial No.
07/985,586, incorporated herein by reference.

Equivalents 

Those skilled in the art will recognize, or be able to
ascertain using not more than routine experimentation, many
equivalents to the specific embodiments of the invention
described herein. Such equivalents are intended to be
encompassed by the following claims.

CLAIMS



1. A method for controlling (e.g. altering) the 
expression of a structural gene in a cell 
comprising the steps of: 
(a) providing a DNA construct comprising a
targeting sequence, a regulatory sequence and
a splice donor site;
(b) establishing an intervening DNA sequence
between the regulatory sequence and the
structural gene by inserting the construct
into the cell by homologous recombination at a
preselected position relative to the
structural gene to produce a homologously
recombinant cell in which the inserted
construct adopts a configuration whereby the
regulatory sequence is separated from the
structural gene by a preselected length of
intervening DNA, the splice donor site being
positioned such that cognate RNA of the
intervening DNA is removed during
post-transcriptional splicing of the primary
transcript; and
(c) controlling the expression of the structural
gene by varying the length of the intervening
DNA selected in step (b).

2. A DNA construct for use in the method of Claim 1 
and capable of altering the expression of a gene 
encoding thrombopoietin when inserted by homologous 
recombination into chromosomal DNA of a cell, said 
construct comprising:
(a) a targeting sequence comprising DNA which
hybridizes to genomic DNA within or upstream
of the thrombopoietin gene;
(b) a regulatory sequence;
(c) an exon; and
(d) an unpaired splice-donor site.

3. The DNA construct of Claim 2 wherein the regulatory
sequence comprises a promoter.

4. The DNA construct of Claim 2 or Claim 3 further
comprising a selectable marker gene.

5. The DNA construct of any one of Claims 2-4 further
comprising an amplifiable marker gene.

6. The DNA construct of any one of Claims 2-5 further
comprising a second targeting sequence comprising
DNA which hybridizes to genomic DNA within or
upstream of the thrombopoietin gene.

7. The DNA construct of any one of Claims 2-6 wherein
the targeting sequence is selected from the group
consisting of SEQ ID NO: 3, SEQ ID NO: 4 or
fragment thereof or a sequence which hybridizes to
a sequence selected from the group consisting of
SEQ ID NO: 3, SEQ ID NO: 4 or fragments thereof.

8. The DNA construct of Claim 7 wherein the targeting
sequence is a fragment of SEQ ID NO: 3 and is at
least about 20 base pairs.

9. The DNA construct of Claim 7 wherein the targeting
sequence is a fragment of SEQ ID NO: 4 and is at
least about 20 base pairs.



10. 



The DNA construct of Claim 9 wherein the targeting 
sequence is at least about 20 base pairs and is a 
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seguence between about nucleotides -1815 to -145 , 
14 to 245, or 374 to 570 of Figure 5 (SEQ ID NO: 
4) . 

11. An isolated DNA molecule for use as part of the
construct of any one of Claims 2-10 being of at
least about 20 base pairs and selected from the
group consisting of SEQ ID NO: 3, a fragment
thereof, and a sequence which hybridizes to SEQ ID
NO: 3.

12. An isolated DNA molecule for use as part of the
construct of any one of Claims 2-10 being of at
least about 20 base pairs and selected from the
group consisting of a sequence between about
nucleotides -1815 to -145, 14 to 245, or 374 to 570
of Figure 5 (SEQ ID NO: 4), and a sequence which
hybridizes to a sequence between about nucleotides
-1815 to -145, 14 to 245, or 374 to 570 of Figure 5
(SEQ ID NO: 4).

13. A method of producing a homologously recombinant
cell wherein the expression of the thrombopoietin
gene is altered, comprising the steps of:

(a) transfecting a cell containing the
thrombopoietin gene with the DNA construct of
one of Claims 2-10; and

(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination.

14. A homologously recombinant cell produced by the
method of Claim 13.
15. A homologously recombinant cell obtainable by the
method of Claim 1 which expresses thrombopoietin
comprising an exogenous regulatory region, an
exogenous exon, and an exogenous unpaired splice-donor
site operatively linked to an endogenous
splice acceptor site of the thrombopoietin gene.

16. The homologously recombinant cell of Claim 15
wherein the exogenous regulatory region, the
exogenous exon, and the exogenous unpaired splice-donor
site are operatively linked to the endogenous
splice acceptor site of the second or third exon of
the thrombopoietin gene.

17. A method for producing thrombopoietin comprising
the steps of maintaining the homologously
recombinant cell of any one of Claims 14 to 16
under conditions appropriate for the production of
thrombopoietin.

18. A method for producing thrombopoietin wherein the
expression of the thrombopoietin gene is altered,
comprising the steps of:

(a) transfecting a cell containing the
thrombopoietin gene with the DNA construct of
one of Claims 2-10; and

(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination; and

(c) maintaining the homologously recombinant cell
produced in step (b) under conditions
appropriate for the production of
thrombopoietin.
19. A thrombopoietin produced by the method of Claim 17
or 18.

20. A pharmaceutical composition comprising the
thrombopoietin of Claim 19.

21. A method of providing thrombopoietin to a mammal in
need thereof comprising administering homologously
recombinant cells of any one of Claims 14 to 16 in
sufficient number to produce a therapeutically
effective amount of thrombopoietin in the mammal.

22. A DNA construct for use in the method of Claim 1
capable of altering the expression of a gene
encoding DNase I when inserted by homologous
recombination into chromosomal DNA of a cell, said
construct comprising:

(a) a targeting sequence comprising DNA which
hybridizes to genomic DNA within or upstream
of the DNase I gene;

(b) a regulatory sequence;

(c) an exon; and

(d) an unpaired splice-donor site.

23. The DNA construct of Claim 22 wherein the
regulatory sequence comprises a promoter.

24. The DNA construct of Claim 22 or 23 further
comprising a selectable marker gene.

25. The DNA construct of any one of Claims 22-24
further comprising an amplifiable marker gene.

26. The DNA construct of any one of Claims 22-25
further comprising a second targeting sequence
comprising DNA which hybridizes to genomic DNA
within or upstream of the DNase I gene.

27. The DNA construct of any one of Claims 22-26
wherein the targeting sequence is selected from the
group consisting of SEQ ID NO: 17, SEQ ID NO: 18 or
fragments thereof or a sequence which hybridizes to
a sequence selected from the group consisting of
SEQ ID NO: 17, SEQ ID NO: 18 or fragments thereof.

28. The DNA construct of Claim 27 wherein the targeting
sequence is a fragment of SEQ ID NO: 17 and is at
least about 20 base pairs.

29. The DNA construct of Claim 27 wherein the targeting
sequence is a fragment of SEQ ID NO: 18 and is at
least about 20 base pairs.

30. The DNA construct of Claim 29 wherein the targeting
sequence is at least about 20 base pairs and is a
sequence between about nucleotides -328 to -2 of
Figure 11 (SEQ ID NO: 18).

31. An isolated DNA molecule for use as part of the
construct of any one of Claims 22-30 being of at
least about 20 base pairs and selected from the
group consisting of SEQ ID NO: 17, a fragment
thereof, and a sequence which hybridizes to SEQ ID
NO: 17.

32. An isolated DNA molecule for use as part of the
construct of any one of Claims 22 to 30 being of at
least about 20 base pairs and selected from the
group consisting of a sequence between about
nucleotides -328 to -2 of Figure 11 (SEQ ID NO: 18)
and a sequence which hybridizes to a sequence
between about nucleotides -328 to -2 of Figure 11
(SEQ ID NO: 18).

33. A method of producing a homologously recombinant
cell wherein the expression of the DNase I gene is
altered, comprising the steps of:

(a) transfecting a cell containing the DNase I
gene with the DNA construct of one of Claims
22-30; and

(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination.

34. A homologously recombinant cell produced by the
method of Claim 33.

35. A homologously recombinant cell obtainable by the
method of Claim 1 which expresses DNase I
comprising an exogenous regulatory region, an
exogenous exon, and an exogenous unpaired splice-donor
site operatively linked to an endogenous
splice acceptor site of the DNase I gene.

36. The homologously recombinant cell of Claim 35
wherein the exogenous regulatory region, the
exogenous exon, and the exogenous unpaired splice-donor
site are operatively linked to the endogenous
splice acceptor site of the second exon of the
DNase I gene.

37. A method for producing DNase I comprising the steps
of maintaining the homologously recombinant cell of
any one of Claims 34 to 36 under conditions
appropriate for the production of DNase I.
38. A method for producing DNase I wherein the
expression of the DNase I gene is altered,
comprising the steps of:

(a) transfecting a cell containing the DNase I
gene with the DNA construct of one of Claims
22-30; and

(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination; and

(c) maintaining the homologously recombinant cell
produced in step (b) under conditions
appropriate for the production of DNase I.

39. A DNase I produced by the method of Claim 37 or 38.

40. A pharmaceutical composition comprising the DNase I
of Claim 39.

41. A method of providing DNase I to a mammal in need
thereof comprising administering homologously
recombinant cells of any one of Claims 34 to 36 in
sufficient number to produce a therapeutically
effective amount of DNase I in the mammal.

42. A DNA construct for use in the method of Claim 1
and capable of altering the expression of a gene
encoding β-interferon when inserted by homologous
recombination into chromosomal DNA of a cell, said
construct comprising:

(a) a targeting sequence comprising DNA which
hybridizes to genomic DNA within or upstream
of the β-interferon gene;

(b) a regulatory sequence;

(c) an exon;

(d) a splice-donor site;

(e) an intron; and

(f) a splice-acceptor site.

43. The DNA construct of Claim 42 wherein the
regulatory sequence comprises a promoter.

44. The DNA construct of Claim 42 or 43 further
comprising a selectable marker gene.

45. The DNA construct of any one of Claims 42-44
further comprising an amplifiable marker gene.

46. The DNA construct of any one of Claims 42-45
further comprising a second targeting sequence
comprising DNA which hybridizes to genomic DNA
within or upstream of the β-interferon gene.

47. The DNA construct of Claim 42 wherein the targeting
sequence is selected from the group consisting of
SEQ ID NO: 23, SEQ ID NO: 24 or fragments thereof
or a sequence which hybridizes to a sequence
selected from the group consisting of SEQ ID NO:
23, SEQ ID NO: 24 or fragments thereof.

48. The DNA construct of Claim 47 wherein the targeting
sequence is a fragment of SEQ ID NO: 23 and is at
least about 20 base pairs.

49. The DNA construct of Claim 47 wherein the targeting
sequence is a fragment of SEQ ID NO: 24 and is at
least about 20 base pairs.



50. An isolated DNA molecule for use as part of the
construct of any one of Claims 42-49 being of at
least about 20 base pairs and selected from the
group consisting of SEQ ID NO: 23, a fragment
thereof, and a sequence which hybridizes to SEQ ID
NO: 23.

51. A method of producing a homologously recombinant
cell wherein the expression of the β-interferon
gene is altered, comprising the steps of:

(a) transfecting a cell containing the
β-interferon gene with the DNA construct of one
of Claims 42-49; and

(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination.

52. A homologously recombinant cell produced by the
method of Claim 51.

53. A homologously recombinant cell obtainable by the
method of Claim 1 which expresses β-interferon
comprising an exogenous regulatory region, an
exogenous exon, an exogenous splice-donor site, an
exogenous intron and an exogenous splice acceptor
site operatively linked to the β-interferon gene.

54. A method for producing β-interferon comprising the
steps of maintaining the homologously recombinant
cell of Claim 52 or 53 under conditions appropriate
for the production of β-interferon.

55. A method for producing β-interferon wherein the
expression of the β-interferon gene is altered,
comprising the steps of:

(a) transfecting a cell containing the
β-interferon gene with the DNA construct of one
of Claims 42-49; and

(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination; and

(c) maintaining the homologously recombinant cell
produced in step (b) under conditions
appropriate for the production of
β-interferon.

56. A β-interferon produced by the method of Claim 54
or 55.

57. A pharmaceutical composition comprising the
β-interferon of Claim 56.

58. A method of providing β-interferon to a mammal in
need thereof comprising administering homologously
recombinant cells of Claim 52 or Claim 53 in
sufficient number to produce a therapeutically
effective amount of β-interferon in the mammal.

59. The DNA construct of any one of Claims 2-10, 22-30
or 42-49, isolated DNA of any one of Claims 11-12,
31-32, or 50, cell of any one of Claims 14-16,
34-36 or 52-53, thrombopoietin of Claim 19, DNase of
Claim 39, β-interferon of Claim 56, or
pharmaceutical composition of Claims 20, 40 or 57
for use in therapy, for example in:
(a) gene therapy;
(b) providing TPO to a mammal by introducing
homologously recombinant cells into the mammal
in a sufficient number to produce an effective
amount of TPO in the mammal;
(c) administering homologously recombinant cells
expressing DNase I to the trachea and lungs of
a cystic fibrosis patient to effect in vivo
secretion of DNase I for the relief of
respiratory distress;
(d) implanting homologously recombinant cells
expressing β-interferon into a patient
suffering from multiple sclerosis to effect in
vivo secretion of β-interferon to diminish
exacerbations associated with the disease;
(e) the delivery of TPO, β-interferon or DNase I
to a patient comprising the steps defined in
Claim 18, 38 or 55.

60. A graft (e.g. an autograft, allograft or xenograft)
comprising the DNA construct of any one of Claims
2-10, 22-30 or 42-49, isolated DNA of any one of
Claims 11-12, 31-32, or 50, cell of any one of
Claims 14-16, 34-36 or 52-53, thrombopoietin of
Claim 19, DNase of Claim 39 or β-interferon of
Claim 56.

61. The graft of Claim 60 for use in therapy, e.g. in
the therapies recited in Claim 59 (a) to (e).

62. A pharmaceutical composition or device comprising
the DNA construct of any one of Claims 2-10, 22-30
or 42-49, isolated DNA of any one of Claims 11-12,
31-32, or 50, cell of any one of Claims 14-16,
34-36 or 52-53, thrombopoietin of Claim 19, DNase of
Claim 39 or β-interferon of Claim 56, the
composition or device for example further
comprising a barrier device, a nebulizer, an
atomizer or being in a form suitable for delivery
by oral, intravenous, intramuscular, intranasal,
intratracheal or subcutaneous routes.
XbaI (-6372)
- 6373 'TCTAGAGTCAGGATCGCACTGAAGGTCTCl^ 

-6311 ACCCTCCCCCCTTTCCTGGG^ 

ApaI (-6233)
-6249 CCCTACCTGCAGCCAGGGO^^^ 

-6187 GTGTC^GAAGTGCCACATGCAGCTTjIT^ 

-6125 GCCCGCCACACCCC^CAC^^ 

-6063 CAGGCTAGGCCAATTAGGATGCCCAGGCAG 

HindIII (-5985)

-5939 GCTGCACCACTICCTAGCTGTCTGACCTT^ 

-5877 TCCCXXTrTCTCTAAAATG^^ 

-5815 GACCACQGGAGGCAATGCAGAGCATC^ 

-5753 AATGGCATCATCTCACCAGGCCTATCTTXXSC^^ 

BamHI (-5667) 
-5691 ACTGCCATTCX^GTCTC^GAAGCGGA 

-5629 GGGTGAGGCCGGACTGAGCCAAAAGCAGCCCCT^^ 

-5567 CGGCAGCGTGACCCCTCCTTKXTrcC 

-5505 GAGGCTAGAGCGCCAGCAGCGAGACTCGGCTCGTGCCACCGCCTGCG^ 

-5443 GCAGCGCCACGAAGTCTGGGACGGGAGGAAGATCGCCT^ 

-5381 TGGCCCAGCCTCAACCACAACCGCGCTCnTCGC 

ApaI (-5318)
-5319 GGGCCCGCTCCTCAO^C^^ 

-5257 GGA1 

-5195 GTAAGAACACX3GGCTTCAGCTGGCCATCGGAAAGGCCAGTCC 
-5133 CGGGACCTAGTA1



-5071 AGGTC^^^ 

-5009 GGTTTGATGTTCTTGCAGCTGACC^ 
-4947 TCCCGGAAAAGGCGGGAAACCO^ 
-4885 AAGGATGTCCCXX^AGTCTUSCC^^ 

-4761 TAGCAAGGCTC^CATGA^^ 

-4699 CACAGAGTGGGCGATCAGTAACAGCAC^^ 

"4575 ACGCCXXXXXX^CCCGGC^ 
"4513 GCCTCGCAGGCCACAGCACGCAGCGC^ 
-4451 GTCTCGTO^GGCATAGACCT^^ 
-4389 GCGCTGCCCAGCraX^CCGTGTGCCG^ 
-4327 ACCXSCGCCCTTCTCCCCCCG^ 

-4203 GCGCCCACCTACCCTGOT 

-4141 GGGCXCTCTK*^ ( " 4079) 

-4079 GCCGGCGGGGCCGGGAGGGCHCGGCATGACGCGAACGGGA 

-4017 GAGGGCOTGGGAGCCXX^^ 
NotI (-3885)
-3893 raTCGGGCGGCCGC^ 

-3831 CGCGGGCGGGCGGGCGCTCACCC« 
-3769 GCCCGTITTATCCCCCGCGCCCG^^ 

-3645 ^TCGCGGGTGGGGGAGGGGAGGGAGGGCTGG^ 
-3583 GGGGAAGGGGGAGC^^ 

-3459 GCCC^GGAAGGGAGCCTCAGGCTAGGGAGGGGCA^ 
-3397 CTX^GCGAGGCCCGGTTCCGCCCGAA 

ApaI (-3307)

-3335 CXnXTCATC^O^CGATC 

-3211 CAGRACAGGGACCTAGCCAGAAACCGGCAGCATT^ 

-3149 CTCTCATIXTTAACITATCCT^ 

-3087 CTTCACCCAAGGGACCCTCTGCCT^ 

-3025 GGTCATGCCTGCCTCCCTG^^ 

-2963 TATCCCAGCACCCTCCTTC^ 

-2901 AGGATCTAGGCCACACTIXriX^GCAGAC^ 

•2839 CCTGAGGAAGTTCTGGGGGACAGGGGGATC1ATGGGATCA 

■2777 GGACAGAGACTCnvSGGGAGACTIXXSGA 

2715 AAGGAAAAGGGGGGCCAGCAGGGWGGTATTTXXX3GGGGAGGTC 
-2653 GACAGGGACACATGGGCCTGGTTATT^^

2591 CGGaGAO^GAA^ 
-2529 CCACTGGACCCCAGCAGACGAGCACCTAAGCrc 
-2467 ACTGTGCCCCGCACCTGACGTCCACTCAA^ 
-2405 ATAACAGGAC^TTTCTCTCA^ 
-2343 AAGATACSGACTCCCTAGGGGATTACAGA^ 
-2281 TCAGCAGCAGGTATGATGTCCAGGGAAAAGAAAT^ 
-2219 CAATCTTAAACAAGACCTCTGTQCTTCTTCCC^^ 
-2157 CTCGAAAAAACTTCTGCIXX^ 
BamHI (-2094) 

-2095 GGATCCCCCTCATCCAAATCTTCTCCGTGTGTGCTGTGGG^ 
-2033 CCAQGCAGQGVGCTCX^AGGGAAGA^ 
-1971 TGGCTCCCTTCTCTGATTGGGCA^ 
-1909 GGGGCTGTGCCXXACCGCCACATG 
~ 1885 TCTTCCTACCrATC^^

-1761 AAATGGGCTCCCAGCTQGGGGAQGGGCAGGCAAACTG^ 
-1699 C^GAAGAGTCTAGCCTTCCCAGAA^ 
-1637 GCTGKTTTCCTGAGGGACTGATC^ 

-1575 AAGGAAAGGGGACATGAGCCCAGGGAGAAAATAAGAGAGGGAG 

-1513 AACACAGTAGTAAGATGGACACAGCCCCAAT^ 

-1451 TTAAGGTTCTGAATCTGGTGCTC^^ 

ApaI (-1377)
-1389 TAATGGGAGGAGGGCCCACTC^TGTTCAC^ 

^CTtSAAAGGAGGAGGA^ 

TATGAGACAGATATGTTA 

GTGGGCGCCTAAGACAAGGTAAGCCCCTAAGGTGGGCA 

kCTGTTAGCCCATCTCTTGGCCTCAGATAATG 



-1327 
-1265 
-1203 
-1141 



-1079 GAGTATTTCAGGACTTCGAGTCCAGAGAAAAGCTCCAGTGGC^^ 
-1017 GGGAAAGAATAGAGGTTAATITCTCCCATACCG^ 

-893 CATATTCCGCCCXnTTGCCAGTTCXTIO 

-831 CCAGGCTGAAGCCACAATACTTTCCTTCTCTATC 

-769 ACCAAGGTTGCTCAGAATTTAAGGCTAATTAAGATATGTG 



-707 



GCTCTCAGCAGGGGTAGG^ 
-645 JtfXaGAGAOCATAT ^

-582 ACGGAGTTTCACTCTTATlX3CCCAGGCTGGtfU3TC 

-456 TGAACCACCACACCCTGCTAGTTITTTTGTATT^ 
-393 AGGCTGGTGGCG^CTCCTGACCTCA 

EcoRI (-268) 

-330 TTACAGGCATGAGCCACTGCACCCGGCACACCATATCX^ 
-267 ATK^GGGCTTTGGCAGTTCCAG^ 
-204 CltXX3K3GCACn\7i\JluCCT?^^ 
-141 AGATTCTO^CXXrrTGGTCCG^ 
"78 GCXXSCCTCCATGGCXX^ 

AUG (1) 

-15 AGACftCCCCGGCCftGA A2S SfiS CDS ACE G GTGAGAACACACCTGAGOGGCTAGOGCC 

43 ATATGGAAACATGACAGAAGCX^ 
106 GGAACCCATTCTCCCAAAAATAAGGGGTCTGAGGG^ 
EcoRI (178) 

169 CXTTCAATGGGAATTCCTGGAATACCAGCTGACAATC 

232 TCTCCTCATCTAAG^A US CIS CTC GTC ATG CTT QTC CTA Ad CCA 

281 AOS OA ACS CIS 2ES AGS CCS ffiZE CCE CCX ©31 CEC CSA SIS 

329 CEC AST AM CIS CUE CGE SAC TCC CSX SEC CEE CAC ACC ASA CIS GTG 
377 AGAACTCCCAACATTATCCCCTTTATCCGCGTAACTGGTA 
440 C^CCATX^CTTCCICTAACTCCTTG^ 

XbaI (562)

503 GATCACACTCTCTGACAAGGAT^^ 
566 GAACT 
Hindi (-4511)
-4512 GTCAACCTTCACAGTAATTCCTTCTTC^^ 

-4448 GATACCCTATAAAGCAAGGTAACGTTAATGTTGAGACC^ 

-4320 CGTGCAGGTTCAAACCACA^^ 
-4256 TCCCTCXXrCTGCACGTCGC^^ 
-4192 AGCGTTCCTITGAGGCCATTTG^ 
-4128 ATTTCAGCaUUVTCAGAGCATClt^ 
"4064 TITCOCTTCCICTG^ 
-4000 GGTCTAAAAACACCTCATCCTGATCTC 
-3936 GCCAGCACCCAT 



ApaI (-3851)
-3872 CGAGGCCTATCTCCAGAGTCX^^ 

-3808 GGGGCTTCGACCTACAGCTCGACAGCACCCATGGAATGTGC^ 

-3744 CCGCCTTGGCCTTAGGGCGGCACGTC 

-3680 TCGGAAGAGGGTGCCOVGGGAGCTC^ 

-3616 CAGAGCCACCCCAGCAGACCTGGCAGTGTGAGAGAAA 

-3552 TGGCTGTTACATGGCAGCATTGACTGACACAGACAGA 

-3488 GTGCTGGAGACTCCAACAAGCCACAGGCTGCAGGGGCAGGATC 

-3424 IXnTCTGGGAATCTATCAGAGGAAGACATAGAGG 

ApaI (-3353)
-3360 CCAGACGGGCCXTCAICTX^GACX^GGCTCC^ 



-3296 



CTCAGGGCAGCCACACAGCAGGCAGCACT 
-3232 TCCTTCTGGCAGGTAGAGGAAGCAGGGGCACTA

-3168 GCATTTGGTCAAGAGCCAOGAGGGGATGACAGACC^ ^ 09 ) 

-3104 CACACGTAGGGGGTTCGG^ 

-3040 GCTGCACCAGGCAGTTICTTGGTGGAG^ 

-2976 GATO3C3TCGCTGTCA^ 

-2912 CACCGAATACTCCGGOXXXXr^ 

-2848 TTAGAGATTAAAAACAGGGAAGAACCAT^ 

"2784 GCAGCCTGAGGAGTGGTGGTGTTTCCAT^ 

-2720 TC^CC^GTGCTGCC^GCCAGAC^^ 

-2656 GTOGftGCXSTOGT^ 

-2592 TGGCCAGACXX^CX^CTTTC^ 

-2528 TCCTGCAGACCCCATTTGTATTCATTICCTGCAGTTC^ 

-2464 GCCAACCGTTCCAGGCCCI^ 

-2400 GGGaZAGCACAGCCCCTTCCAAGTCG^^ 

-2336 AGCCCTGGAACCTCTGAATGTTGATTTTTGTA 

-2272 TGTTGAGATAAGGACATCCTCCCTC5CTCTCTGG 

-2208 AGAAGAAGAGGCAGAGACTCXSGGTGATGCAGCCACAACTAAGC^^ 

-2144 CCTGCAGAAACTGGAGGGCA^ 

BamHI (-2032)

-2080 CCTACTGACTCX:CTGACTIX2AGACGTCCAGT^^ 
-2016 TTTAAGCAACCAAACTTGIXKSTAGTTTCACC^^ 
-1952 AGATTCCAAGAAATGAGTGGCGGGGTGCGGTGGCTCACA

-1888 GATTGCTroGGCTCAGGACTTOGAGAC^^ 

-1824 CGATCGTCACGCCTGTAATCCCAGCA^ 

-1760 GTTTGAGACCAGTGTGACCAACATCGTGAAACX:CTC 

-1696 GZTGTGGIG^^ 

-1632 CCCAGGAAGCAGAGGFTTGC&G^ 

SphI (-1509)

-1568 caagattccgcctcaaaaaaaaaaaaaaaa 
-1504 acctgtggtcctcx^aogccx3<^g 

-1440 AGAGCAAGACCCCATCTCTACCAAAAAAATTTAAAAATTA 

-1376 GTCTTAGCTACTCAGGAGGCTGAGGAGGGA 

-1312 AGCCATGATTTGGCCACTGCACTCCAGCCT^ 

-1248 ATAAAAACCCAAAACAAAAGAACCAAGAAAITACTGGACC^ 

BamHI (-1162) 
-1184 CTGCCCnCK^CTGGTCACTCGGATC 

1056 GGCGCTGGTGCltXIAGGCCXXrCACCACTGeiu 

- 992 CTGCACCTGATGGCGATGAATCAGGAAGGCAGGCGTO 

-928 CAGCCACCAGGGGGCTCCATTTGCTACrT^ 
ApaI (-860)

-864 TTGGGGCCCCCAGACAAGAGACAGGGAGACTGGAGCCCAGCCC 

-800 CCCATCCCTGCCCTATCCTGGAAGATGGGGGCCACCACACGTRCA 

-736 CTTTOGCCTTGTrATCAGACATT^ 
-672 CAGCAGCAAG&AACCTCnXSCITACA^

-607 ACACAGAGCCATTOTTTTCTGCACTCT^ 

BamHI (-498) 

-542 CTGCCTGAACTTTrAAAACTTC^ 

-477 GCCAGGG 
CAP site (-469)
-470 CCTTCAACnX5CTTCT^ 

-408 AGGAGAAAATTGTCATCAAAGGATATTC 

-346 ACATCACCATCATCTCAGGTGAGCACCAGGTGC^^ 

-284 GCAGGGAGGGAGGCTmGAGTCrcAT^ 

SmaI (-220)
-222 TCCCGGGCX»3GTTTTCrrcGlXX^ 

-160 TTTGGCTTTCriGGACGTICTAC^^ 

AUG (1) 

-36 CTTCTCHTATCnxriCT ATS AQG QQ£ AAG CTC 

19 OE GGG GCG CTG CTG GCA CES S2S GCC OA CTG CAG GGG GCC GTG 
64 TC£ CTC AAG ATC GCA GCC TTC &&C ATC CAG TTT GGG GAG ACC 

109 AAS ATS TCC MX SCC ACC CTC SEC ACC 2&C an; £TS CAS ATC CTC 

154 ACC CSC Tai SAC ATC CCC CTC CTC CAC S&S CTC ASA SAC ACC CAC 

199 CTG ACT GCC GTG GGG AAG CTC CDS SAC AAC CTC AAT CAS SAT CCA 

244 CCA SAC ACC TAT CAC TAC GTC GTC AGT GAG CCA CTC GGA OGG AAC 

289 AGC TAT MS GAG CGC TAC CTG TTC GTC TAC AG£ CCT GAC CAG GTC 

334 TCT GCG £ 
-8711 AGCTOCTCXTTITOGGA^^

-8646 AAAAAAAAAAAAGAAATAAAAATTAGAGCAGAAATC^ 
-8581 GAAAATCAACATAAAAAGTCTGGTTCTIGAAAAGAT^ 
-8516 TAATTAAGGAAAAAAGACAGAGGACACAGATTACTA 
-8451 GCAAAl"lU*iATAGGCATTCAAAGCGTAATAAAAG 
-8386 TGATAAGTAAATAGAATGAACCAATIX^^ 
-8321 TAAACAATCTGAATAGCCTATATCTTATO 

EcoRI (-8223) 

-8256 AGGAAGCACAATGCCCAGATGGGTTCACTAGTGAATl^ 
-8191 GTATCAACTTTCTACAATCTCTITCAGAAGACAG^ 
-8126 CTAGGCCAGCATTACCTTAATACCGGAACTAGAA^ 
-8061 CAATATCTCTCATCAACAAAGATACAAACATriTt^ 
-7996 TGTATCAAAAAATATACACCACAACCAAGTAGAATTTAT^ 
-7931 TTTGAAAATCAATTAACGTAATTTC^ 
-7866 TGATAGACACAGAAAAAGCATTTGACAAAATTTAACAC<^ 
-7801 CTAGGAATAGAGGAAAACTTCCTCAGCTTGAA 
-7736 AACTCCTCTTAAAAAATAAAGTTTTTCATTTAAAAAG 
-7671 GTATCTCATTITAGACC^TCAGCTATGGATAGTO 
-7606 TGTTTCTGGCAATGTTCCAGACTACATTTAAAA^^ 
-7541 AAGAAAAATATGAAAATGCTTTGCCGTGTTAATGCTACT 
- 747 6 ACTITATTTATATTTCATTA G^ 
-7411 ATGCCACATTACATATAATTCTCATC

-7346 1TTTCTTMTITTTGMGACCTTCACAG 

-7281 TAAAGTATATTTGTCATGATTTATACTGGGTAAGGGTTT^ 

-7216 TCTCATCACATCATATCAAGTTATATACCAT^ 

-7151 ATTTX^CTRATTTAGTCTATAT^ 

-7086 TAATC^TTATTTAGAGTTTCT 

-7021 ItnAAGA^iwxxi-i-jAT^ 

-6956 CTCTCATTCTATGGCCTGACTT^^ 

-6891 TGCAATCTAATTAACAATCTTTTCTTTC^^ 

-6826 ACTGAAGTCATGATGGCATGCTTCTATATTATTT^ 

-6761 TTAGACTTATAATTO^GTGG ^ 

-6696 TTTACATATAAATATATTICCCTGTTTTTC^ 

-6631 AATGCCATATTTTTTTCATAGGTCACTTAC^ 

-6566 TTTATCAGCCTC^CTGTCTATC 

-6501 AAC^TTCTrTCCCATl'lUXi'lUCTACAAGAATA l ^ 

-6436 TTTTAGAATGAGGTTGGCAAGTTAACAAACAGCTTT^ 

-6371 ATGTGAAAAGAAAGTATACClTXZACAATATTAAGTCTr^ 

BglII (-6248)

-6306 CGTITCTOCATTAACITAGACATTCATTAA 

-6241 ATTCATITAAATCTTCACTAACCTCTCATra 

-6176 CTTCTTTGCCTAGATTTATTTCCAAGTA 
-6111 AATCTAATTTTTCACCTT^^

EcoRI (-6032) 
~ 6046 TOGCGACXriTCXrn^ 

-5981 AGATCATCTGCATATA^^ 

-5916 AAGAAAATATCCAAATGGTCAATAAACATATGAAA^ 
-5851 ATGCAAATTAAACTATAATGAAGTATTATTCT^ 
-5786 ACAATATCAAAGTTGGCAAGAGTCTC^TAC^ 
-5721 AAATTGGTAC^^CATTTGGGAAGTCAT^ 
-5656 CTATGAGCCAGTTACTTCATTCTAGGCAT^ 
-5591 AATACAGACAAGGAATTOGATAGGAGCATTAA 
-5526 ACTTlGAAGGGATAAAACATI^^ 

-5461 AAACTATACACACAAGATAGAOGAATTICGCAGACAT^ 

-5396 CAAAGCTCAAAAACAGACAGAATCTAGAGTC7IT 

-5331 AAACTAGTGACGAGAGAGAGGAGAGAGAATAATGATTO^ 

-5266 TCCCX^CAAATTTCACATGTTAAAACCTAATCC^ 

•5201 GTGGATAATTAGGTAATGGAAC^GAGCCCTAACAAA 

5136 GAGCCTCAGGGACCTTGTTTCCCGCTTC^ 

5071 CMXXAGCCCTCAT^^ 

5006 CTATAAAAAGAAATGCTTGTTGTTTAAAAGGCAT^ 
4941 CAAGAGAC1TAAGAGGGAACAAGAGGGCGATTTCTGTTG1 
4876 CAAAGAGTCCAGACGTTTITATTTTATAAC^ 
EcoRI (-4759)

-4811 TTITCTATglATATra^ 

-4746 ATCTATTAATCTCTTATGAAAG^^ 

-4681 CTTCATTTATCTTCTCX^ 

-4616 TTCTTTGATAGGGACCCTCTICC^^ 

-4551 ACTAAATCTTTATITCT^ 

-4486 AATTGGCTCCTATCTXnXSAAATTTATAGA^ 

-4421 GAAATGTCATTCAAGTTrTAC'lUUXJ'X'AAATg 

-4356 GAACTGGTGC^GGGACTCGAAGTAGTTTT^^ 
-4291 TCTGTGCAAAAATAACGTCCACAGAAGG^ 
-4226 GGGCACTAACCCTTACAATGCAGATACACACTO 
-4161 CAGAAGGTTAAATAAATTTTCCTCGTTATIC^ 
-4096 TAAAACTTAAAATGATCHATTTAAAAGGAAGAAA 
-4031 AGATTACTACTAATCCI^^ 

-3966 GTAOTAGGAAGCACCTCAAGAACACAATAGCAGGAAG^^ 

-3901 GAAAAAAAAAATOCCTTTTra 

-3836 CTITATTTTCACCCTCCACAGCCATGA^ 

-3771 CCAATCACCTCTAACATTTCTGCCT^ 

-3706 CAAAGACCTCTTGAATTAAGTCCAAATGCTACACT^ 

-3641 CCTGACTITTCCACCCTCAC^ 



FIGURE IAD 



WO 96/29411 PCT/US96/03377 


- 3 5 11 AGCrerKSGATATCATCC^ 

-3446 A'l'i'i'iATA'lVj^IACT 

-3381 AGA.CTCTACACAAAATTTAA1TATCTCA 

-3316 TATTTTGGATATACTATGCTAAATAAAACATATTATT^ 

-3251 CTTIX^UVTATGGCTACTAGAGCTTTTTAAA 

-3186 AATGCCCTCAACCACATOVCCTC^ 

-3121 GGCACACTXXX^GCATT&AGGGa^^ 

-3056 TTTCTTTGAGAGCCATCAT^ 

-2991 AGACTGCTTGATATTCTACAGGAAAGATCAC^ 

-2926 TGTGTATCTTTCACACATrACACAGCCTC^ 

-2861 GATAATAAGCCATCTCAAATGTTTACT^^ 

-2796 AATAAATGATAACTAGTACTACCXSCCACTACTGTTff 

-2731 AAGGACCATTTCCGGATGGAGGATAAGAGACCATTTGATGTC 

- 2 6 6 6 CACCTGGAAAGGTCAACTATATACTUVGCCTOCAAGT^ 

-2601 GACTCTATAGACTGTCTCCTCTTTCCTGAGAGGGAC^ 

-253 6 GCTCCITCCATTGGCTTTTC 

-2471 CAAAACCCCAAGGAATTACTCAAATACTGACATAACAGA 
-2406 TTITTAATATTCTGA^CTCATTGTTTCT 

-2341 GGCCTGCAAAGCGAAAGGCAGAGAGAATGAAACCCATAGAGAGGCAG 
-2276 GACTCGTTTATTTTATAATGTAAATTAGTCTAT^ 
-2211 TCGAAAATACAAAGAATAAAAGGAGGAA

-2146 ACTATTAAAATGGTGGTITACI^^ 

-2081 GTATCTACAATITTATGTTCT^ 

-2016 AAATAATATATGCTCATAATAGAACATTTT^^ 

-1951 GTAATATTTATTAAATITTCTCCAAG^ 

-1886 GCCTAATAACXTCTATTICCAGAC^^ 

-1821 AAGGTATGAAGTIXSAAAAGATAAAGATTTI^^^ 

-1756 COCCAGGGTAACTACTATTAATAGATAGTAATTC^ 

-1691 AGCATCATATglATACCriU^ 

PvuII (-1580)

-1626 TGTATTGCTCTTTTCACTAAATCTATC^^ 

-1561 TOGCTGAATAATATIXXMK^^ 

-1496 ATTIXnCTTTCTITAOT 

-1431 TACACATGCACATACACATGCATATTTCTGCAGGGAT^ 

-1366 TGCAAGTTAAAGGAAO^TCrCATTGC^^ 

-1301 TGGTCTCTCCTTGTAAGCTAGTTTO 

-1236 TCCTGGCCAAAGAGCAGAGTGCCACAGACCACAACT 

-1171 

-1106 GTAAC^GCTTATTTTTCTGAACCAGGA^ 
-1041 CTCTTCTGTTAGCTTT^^ 
-976 ACCCTGGTIX3GGCCITCTCTATC 
-779 GGCAAAACTCCTTGCAGTTTCAGCTACT^

-713 CTTTCAGTATCCAAAGAAGATIXSGTT^ 

-647 AGAGGATGCTCAATTCCCTCTTIATAAAA 

-581 GTATATTTTAAATCATCCCTAGATTACTTA 

-515 ACACTGmTCTTTAAAATTTACATT^ 

-449 AAATATTTTCC^TCTACAGTCAGTAGAATC 

- 3 83 GTATCTTTTAGTCTTTTGftGCTTCTTG 
-356 AATTCItaGGTOGTTTXXHTT^
-291 TTCACTGAAACTTTAAAAAACATTAGAAAACCT^ 
-226 TATCATAAGATAGGAGCTTAAATAAAGAGTTTTAGAAACTACT 
-161 ACTGAAAGGGAGAAGTGAAAGTGGGAAATTCCTC^^ 
CAP (-81) 

-96 GGC^TACCC^C(^GAAAG^ 

AUG (1) 

-31 GGTAGTAGGCGACACTGTTCOTGTO AE£ &CC AAC AAG XEX d£ CE C&A 

25 AXE OCX CEC CIS US ISC PS XCC Ad ACA GCT CTT tcc ATC agc tac 
73 AAG TTC CTT QGA XEC CIA CAA AGA AGC AGC AAT TTT CAG TGT CAG AAG 

121 CTC CIG TOG CAA US AAT QQG AGG CTT GAA TAC TCC CTC AAG GAC AOS 

PvuII (199)

169 A2S. AAC TIT GAC A2S PPT GfcG fiaS ATT AAG CfcQ CTG CAG TTC CAG 

217 &AG GfiS GAC QCC GCA US ACC A2C X&X G£G AXG CXC CAS AAC AXC TTT 
265 QCX ATX TTC AGA £&A SAX 3£A XCX AGC ACX SOS TGG AAT GAG ACT AXE 
313 GTT GAG AAC CTC CXS GCT AAT GTC XBX CAE CAG ATA AAC CAT CTG AAG 
361 ACA CTC CTG GAA G&A MA CIS GAS AAA GAA GAT TTC ACC AGG GGA AAA 
409 CXC &X£ AGC ASX CIS CSC CIS AAA AGA X&X X&E GGG AGG ATT CTG CAT 
457 2AC CIS AAG GCC &&G G&G 3AC &GX CAC XGX GCC XGG ACC AIA SEC AGA 
505 GXG GAA AXC CXA AGG AAC XXX XfcC XXC AXE A&C AGA CXE AGA GSX 3AG 
BglII (565)

553 OS CGA TCAAGATCTCCTAGCCTGTC^ 

615 TCAACXTAGCAGATCCTGTTTAAGTGACTGATCGCTAATC 
680 mXESAAATlTTT^^

745 TTAl'ri^-iGGTGCAAAAGTCAACATCGCAGTl^ 
810 TTATAAAATTCCOAGTACCrATTAG T 

940 CAATAAGGGGACCTGAACCTTATaSGCK^TAAATA 
1005 AAAAGGAAAGCTGGAGGGTCTGGAACTAAACC^^ 

Ball (1099) 

1070 ATTCTCTCATCATAAAGTTAGAATTGAGCTC^ 
1135 TTCTCTIXnCCCT 

1200 TAATTATXnXXXXXXX^CCATCCCTGCAA 